American Documentation is a publication of the American
Documentation Institute. It is a scholarly journal in the
various fields in documentation and serves as a forum for
discussion and experimentation. Papers already published or
in press elsewhere are not acceptable. For each proposed
contribution, one original and two copies (in English only)
should be mailed to Mr. Arthur W. Elias, Editor, American
Documentation, Institute for Scientific Information,
325 Chestnut St., Philadelphia, Pennsylvania 19106. The
manuscript should be mailed flat in a suitable-sized
envelope. Graphic materials should be submitted with suitable
cardboard backing.

TYPES OF MANUSCRIPTS: Three types of contributions are
considered for publication: full-length articles, brief
communications of 1,000 words or less, and letters to the editor.
Letters and brief communications can generally be published
sooner than full-length manuscripts. Books, monographs,
and reports are accepted for critical review. Two
copies should be addressed to the Review Editor, Dr.
T. Hines, 54 North Drive, East Brunswick, New Jersey.


PROCESSING: Acknowledgment will be made of receipt of
all manuscripts. American Documentation employs a
reviewing procedure in which all manuscripts are sent to two
referees for comment. When both referees have replied,
copies of their comments are sent to authors with the
Editor's decision as to acceptability. The refereeing
procedure requires about 30 days. Authors receive galley proofs
with a five-day allowance for corrections. Standard
proofreading marks should be employed. Reprint order forms are
forwarded with galleys.


Format: All contributions should be typewritten on white 
bond paper on one side only, leaving about 1.25 inches (or 
3 cm) of space around all margins of standard, letter-size 
(8.5 X 11 inch) paper. Double spacing must be used through- 
out, including the title page, tables, legends, and references. 
The first page of the manuscript should carry both the first
and last names of all authors, the institutions or organiza- 
tions with which the authors are affiliated, and notation as 
to which author should receive the galleys for proofreading. 
All succeeding pages should carry the last name of the first 
author in the upper right-hand corner (0.5 inch from the 
top) and the number of the page. 


STYLE: In general, style should follow the forms given in
the Style Manual for Biological Journals (SMBJ), published 
for the Conference of Biological Editors by the American 
Institute of Biological Sciences (1964). 

TITLE: The title should be as brief, specific, and descriptive
as possible. Vague and unrevealing titles may delay
publication.

ABSTRACT: An informative abstract of 200 words or less
must be included, typed with double spacing on a separate 
sheet. This abstract should present the scope of the work, 
methods, results, and conclusions. 

ACKNOWLEDGMENTS: Financial support may be listed as 
a footnote to the title. Credit for materials and technical 
assistance or advice may be cited in a section headed 
“Acknowledgments,” which should appear at the end of 
the text. General use of footnotes in the text should be 
avoided. 

GRAPHIC MATERIALS: American Documentation requires
finished artwork. Follow the style in current issues for lay- 
out and type faces in tables and figures. A table or figure 
should be constructed so as to be completely intelligible 
without further reference to the text. Lengthy tabulations 
of essentially similar data should be avoided. 

Figures should be lettered in black India ink. Charts
drawn in India ink should be so executed throughout, with
no typewritten material included. Letters and numbers ap- 
pearing in figures should be distinct and large enough so 
that no character will be less than 2 mm high after reduc- 
tion. A line 0.4 mm wide reproduces satisfactorily when 
reduced by one-half. Graphs, charts, and photographs should 
be given consecutive figure numbers as they will appear in 
the text; however, figure numbers and legends should not 
appear as part of the figure, but should be typed double 
spaced on a separate sheet of paper. Each figure should be 
marked lightly on the back with the figure number, author’s 
name, complete address, and shortened title of the paper. 

For figures, the originals with two clearly legible
reproductions (to be sent to referees) should accompany the
manuscript. In the case of photographs, three glossy prints
are required, preferably 8 X 10 inches.

ORGANIZATION: In general, papers should state the
background and purpose of the study, followed by details of
methods, materials, procedures, and equipment. Findings,
discussion, and conclusions should appear in that order.
Appendixes may be employed where appropriate for
extensive lists, statistics, and other supporting data.


BIBLIOGRAPHY: Accuracy and adequacy of the references 
are the responsibility of the author. Therefore, literature 
cited should be checked carefully with the original publica- 
tions. References to personal letters, abstracts of verbal 
reports, and other unedited material may be included. If 
an as-yet-unpublished paper would be helpful in the evalua- 
tion of a manuscript, it is advisable to make a copy of it 
available to the Editor. When a manuscript is one of a 
series of papers, the preceding member of the series should 
be included in literature cited. 
Editorial


Documentation Abstracts 


The Editor takes great pleasure in calling the attention of ADI members and 
American Documentation subscribers to the advertisement appearing on the back cover of 
the present issue. The publication of Documentation Abstracts as a separate publication, 
under the joint auspices of the American Documentation Institute and the Division of 
Chemical Literature of the American Chemical Society, had been a cherished dream of 
myself, Hans Peter Luhn, and many others for so long that it often seemed only a 
dream. It would be impossible to list all those who have lent support and encouragement 
to this effort, but the names of a few must be recited: Dr. Herman Skolnik, of the 
Journal of Chemical Documentation; Mr. Charles Bourne, long-time Editor of the 
Literature Notes Section of American Documentation; and Mr. John Markus and Miss
Mary E. Stevens, of the ADI Publications Committee, have rendered yeoman service.

Documentation Abstracts, whose logo appears on the cover of this issue, will appear 
quarterly with issue Number 1 appearing in February. It is estimated that each issue 
will contain 500 abstracts from a unified coverage list representing the interests of ADI, 
the Chemical Literature Division, and the Special Libraries Association—Sci Tech 
Division. The latter group is represented in this project through the efforts of Mr. 
Charles Kip, who was Editor of “Documentation Digest.” Mr. Kip has worked with the 
group designing the publication and has secured the continuing cooperation of the 
“Documentation Digest” abstracting group. 

The Editor is happy to report that by action of the Council of the American 
Documentation Institute taken December 16, 1965, all members of the Institute will 
receive the new publication at no charge. Every effort will be made to continue this
policy. However, members can help in this effort by promoting outside subscriptions
to the new publication. The charge for a subscription in 1966 has been set at a minimal
fee of $8.00.


Good luck to Documentation Abstracts! 
A. W. Elias, Editor
An On-Line Technical Library Reference Retrieval System


In October 1964, Lockheed Missiles & Space Company 
(LMSC) started to experiment with an on-line refer- 
ence retrieval system which uses a coordinate search 
strategy. Installation of the retrieval system was 
greatly facilitated by the existence of the LMSC on- 
line Automatic Data Acquisition (ADA) system which 
provided the vehicle for this application. 


Introduction


Information retrieval is a vitally important problem
and a very popular area for research. Consequently, there
exists an information explosion concerning information
retrieval, resulting in many debates and discussions in
information-processing circles. However, no general
discussion will be presented in this report, since the
system to be described here is designed to retrieve
references, i.e., the names of documents, not to retrieve
information. At present, the system does not concern
itself directly with the major intellectual and practical
problem in this area: that of resolving the conflicting
requirements that a document be described briefly, and
then that this brief description be adequate for later
retrieval of that document.

Working reference retrieval systems involving the use
of digital computers are no longer uncommon and have
become highly sophisticated in their search strategies.
Almost invariably, the procedure used is to accumulate
a number of requests, code each in several alternative
search formulations, and make a computer run with the
batch. This method has the virtues of simplicity and
economy, especially when the material to be searched can
be loaded on a semirandom-access storage device (such
as a disc or drum) and organized in an inverted file.


Experience with the current hardware-limited design 
has led to a more flexible "conversational" approach 
which is currently being implemented. The current 
system and the second-generation design using a 
"dialogue" are briefly described. 


This method has several inherent disadvantages as well.
The principal drawback is that of any batch-processed
computer job: Once the job is submitted, the die is cast.
Aside from some mechanical internal options, there can
be little flexibility in the computer processing, and no
monitoring of program execution. Turnaround time can
also cause difficulties. When each successive refinement of
a search request imposes a delay of a day or more, fewer
refinements can be introduced. Another disadvantage is
that in most batch-processing systems, an interpreter
is required to translate the user's request into an
acceptable system formulation so that the search can be
made. In general, it seems that the more intermediaries
interposed between the eventual user and the document
collection, the greater the chance for distortion of the
user's intention.

There is an alternative to this well-worked approach, 
and one which is just beginning to be explored. This is 
the possibility of retrieval via an on-line dialogue. This 
approach has been under study by Kessler of the Massa- 
chusetts Institute of Technology (1), implemented in a 
small experimental design by Salton of Harvard (2), and 
is realized in the current Lockheed working system. 

There is a general feeling that on-line retrieval is the 
next major development, and represents the retrieval 
system of the foreseeable future (3 and 4). The basic 
reasons that the LMSC group endorses this view are:
(a) some sources of semantic distortion are eliminated by
putting the user directly in contact with the system and
its vocabulary; (b) the use of a dialogue permits the
user to develop a highly complex search formula by a
series of simple steps; (c) the result (negative or
positive) of one search can be applied immediately to the
next so that the overall search can be considered as a
sequence of attempts. The intelligence and experience
of the user of the system are involved not just once, in
the framing of an inflexible question, but are continuously
engaged until an acceptable result is achieved.


As far as can be determined, there exist, aside from
those described here, just two other on-line retrieval
systems. The SMART (2) system of Harvard University
is unusual in a number of respects. It offers a large
number of combinations of search strategies, a choice
among a group of eclectic techniques. It has the refreshing
quality of a design that uses many reasonable means
to an end, rather than selecting one and discarding all
competitors. The system embraces indexing as well as
retrieval and provides a step-by-step definition of
strategy. To date, it is an experimental tool and not an
operating function of an existing library.


The other similar system, developed in conjunction with
Project MAC at Massachusetts Institute of Technology
(MIT), was reported by Kessler and collaborators (3) the
same week the Lockheed system was first tested. These
two independent developments resemble each other in
many respects, but there are three major differences:

1. The MIT system is currently dealing with a specialized
scientific collection (physics journals) as
opposed to report literature in many fields.

2. The MIT development already uses the teletype
terminals which represent the next step in the
evolution of the Lockheed system.

3. The data base of bibliographic information in the
MIT file was apparently designed for the experiment
whereas the Lockheed design uses an
existing machine-readable master catalogue card.

No description of methods of vocabulary control or
translation facilities is given in the cited reference.
These differences have had great influences on search
strategies. An example is the strong bias toward citation
tracing in the MIT design, which seems much more
appropriate to a journal than to a report collection. Other
differences may be noted by using Reference 1 and the
present report.


Converse I: The Current System


The system that is now working at LMSC depends
upon two other operating Lockheed systems. The first,
MATICO (Machine-Aided Technical Information Center
Operations), supplies the machine-readable data base.
The second, ADA (Automatic Data Acquisition), supplies
the on-line computer and its time-sharing monitor. The
existence of these facilities permitted the design,
development, and initial testing of the present system in a
6-week period. The accumulated product of the MATICO
program will allow the inclusion in the retrieval system of
the entire LMSC in-house report collection of 100,000
titles as soon as the system can accommodate them.

American Documentation, January 1966


THE SYSTEM AS SEEN BY THE USER


The input device is a 1-card reader which has a
series of levers permitting the input also of a 10-decimal-digit
number. (As will be described later, this device is to
be replaced in a later version by a more convenient
mechanism.) The adjacent output device is a teletype
printer. Beside the card reader is a file cabinet containing
prepunched cards prepared for insertion in the reader.
Each card contains a term or number describing one or
more of the documents in the collection.

The types of information used as descriptors are:
(a) personal author, (b) all significant title words, (c)
corporate author, (d) all subject headings (an average
of three per report), (e) contract number, (f) original
report number, (g) secondary report number, and (h)
date of publication. A file was also generated describing
security classification, but was too broad for use in
the present system.
Aside from these descriptor cards, two additional control
cards are used, the NOT card and the END card. If
descriptor cards are inserted with no intermediate control
cards, they are considered to specify a logical-product
search prescription. An intermediate NOT card negates
the descriptor immediately following, and the END card
informs the program that the search prescription is
complete.

Instructions to the system user are:

1. Select a descriptor for the desired report from one
of the categories of information, and find the corresponding
inquiry card in the labeled card file drawers. If there
is no such inquiry card, another descriptor should be
selected which will relate to the desired information.
Continue until your certain knowledge of the document or
subject is exhausted. If it is desired to exclude some
document references, each descriptor card using a term
to be excluded from the document description is preceded
by a NOT card. The inquiry card contains (left-to-right)
the prefix categorizing the information on this card, a
code number for the descriptor, the descriptor itself, and
at the far right the number of documents in the file using
this descriptor. (If this number is 1, clearly no further
cards are needed to retrieve the unique document
reference.)


2. Place the first inquiry card, with [illegible]
up and facing the user, in the long card slot on the top
of the machine. Enter the inquiry card by depressing
the metal bar on the back of the card slot. Repeat the
procedure for each new card input. When all the inquiry
cards have been entered, insert an END card.

3. The response to the inquiry will be printed out on
the teletype printer in one of three ways:

a. If 1-3 documents are found, the total information
on an edited library catalogue card will be printed
out for each document.


b. If 4-15 documents are found, only the corresponding
document reference numbers will be printed
out. In this case, there are two options for the next
procedure: (a) "refine" the information request
by inserting more descriptor cards, or (b) enter
the document reference numbers by using levers 4
through 9 of the card reader. When an END card
is then entered, the corresponding catalogue
information will be printed out.

c. If more than 15 documents have been found, the
printer will only write the number of documents
that meet the request. In this case, the inquiry
request should be refined.


4. The information request can be refined by adjusting
the lever on the extreme left so that a "1" appears in
the viewing window. Another card is then selected and
entered. The subject area may be limited by inserting
a NOT card before the area of information not desired,
or further restricting descriptors may be entered until
the total number of documents retrieved is less than 15.
The teletype printer will respond after each card
insertion, giving the information appropriate to the number of
document references retrieved.


5. The information on a typical edited library catalogue 
card is as follows: 


> 
45931 GDC-U414-61-905 37P UN
GENERAL DYNAMICS CORP. ELECTRIC BOAT DIV., GROTON, CONN. >
DIGITAL SIMULATION OF A CONFORMAL DIMUS SONAR SYSTEM.
PHASE 1.
ARNOLD, C.R. FEB 61 C-SV >
DIMUS (DIGITAL MULTIBEAM STEERING)/SONAR SYSTEMS/HYDROPHONES>
AD-265 398// >


6. Four further responses may be made by the system.
"NO REPORTS FIT YOUR INQUIRY" may be printed if there
are no reports corresponding to a given combination of
inquiry cards. "REPORT NO. 000000 IS NOT ON FILE"
indicates that there is no report on file in the system
with the requested reference number. This might be due
to an error in keying-in the number on the variable
levers. "ILLOGICAL 1ST TERM. PLEASE RESUBMIT." indicates
that the request has started with an END or NOT card,
neither of which is permitted as the first card. The phrase
"END OF RESPONSE" follows each system response.
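The card logic and the tiered responses described in the instructions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample inverted file, and the wording of the non-error responses are hypothetical, and the original program ran against a card reader and teletype, not in Python. Only the NOT/END semantics, the 1-3, 4-15, and more-than-15 thresholds, and the quoted error messages come from the text.

```python
from typing import Dict, List, Optional, Set

# Hypothetical inverted file: descriptor -> accession numbers of documents using it.
INVERTED_FILE: Dict[str, Set[int]] = {
    "SONAR SYSTEMS": {45931, 45932, 45940},
    "HYDROPHONES": {45931, 45940},
    "ARNOLD, C.R.": {45931},
}


def run_search(cards: List[str], inverted: Dict[str, Set[int]]) -> Set[int]:
    """Treat the card sequence as a logical product; NOT negates the next descriptor."""
    if not cards or cards[0] in ("NOT", "END"):
        # The text states that neither control card may come first.
        raise ValueError("ILLOGICAL 1ST TERM. PLEASE RESUBMIT.")
    result: Optional[Set[int]] = None
    negate = False
    for card in cards:
        if card == "END":
            break  # search prescription is complete
        if card == "NOT":
            negate = True
            continue
        postings = inverted.get(card, set())
        if result is None:
            result = set(postings)
        elif negate:
            result = result - postings  # exclude documents using this descriptor
        else:
            result = result & postings  # logical product with earlier descriptors
        negate = False
    return result if result is not None else set()


def respond(hits: Set[int]) -> str:
    """Choose the response tier from the number of documents found."""
    count = len(hits)
    if count == 0:
        return "NO REPORTS FIT YOUR INQUIRY"
    numbers = ", ".join(str(n) for n in sorted(hits))
    if count <= 3:
        return "FULL CATALOGUE CARDS FOR: " + numbers  # hypothetical wording
    if count <= 15:
        return "REFERENCE NUMBERS: " + numbers  # hypothetical wording
    return str(count) + " DOCUMENTS FOUND"  # hypothetical wording
```

For example, the card sequence SONAR SYSTEMS, NOT, HYDROPHONES, END keeps only the sonar reports that are not also indexed under hydrophones.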


BEHIND THE SCENES


The retrieval files are made up from the information
on library "master catalogue cards." This information is
then entered on up to 44 key-punch cards per catalogue
card. Two files are created and are linked by the accession 
number of the document. The first, the inverted file, 
is derived from an “explosion” of the catalogue master. 
In this operation descriptors corresponding to all of the 
categories mentioned previously are identified and sepa- 
rated, are labeled with the proper letter prefix, duplicates 
are combined, and the file is alphabetized within the 
descriptor category. This file is then output in a suitable 
form for editing with each card displaying a descriptor 
followed by the card or cards containing the correspond- 
ing document accession numbers. This card file is then 
edited manually. File entries whose descriptors are non- 
significant terms are purged (using a list of 585 such
terms compiled by Bell Laboratories) and synonymous
descriptors are combined by grouping them together.

The final stage of the inverted-file program reformats
the edited information and outputs both an inquiry card
deck and the inverted-file tape which will be loaded on
the retrieval file. This step of the program assigns an
arbitrary sequence number to each descriptor. If several
descriptors have been identified as synonymous, each is
assigned the same sequence number.

It will be evident that this design gives a tight control 
of vocabulary. Every term which can be used in the 
form of an inquiry card to query the system must exist 
within the inverted file, and vice versa. 
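A minimal sketch of the inverted-file construction described in this section may make the steps concrete. The names, the one-letter prefixes, and the sample data are assumptions made for illustration; the original program worked from key-punched cards and included a manual editing pass, which is represented here only by the stop list and the synonym groups passed in as arguments.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Key = Tuple[str, str]  # (category prefix, descriptor)


def build_inverted_file(
    catalogue: Dict[int, List[Key]],
    stop_terms: Set[str],
    synonym_groups: Iterable[Set[str]],
):
    """Explode catalogue entries into descriptor -> accession-number postings."""
    # Map every synonym onto one representative term (alphabetically first here).
    canonical: Dict[str, str] = {}
    for group in synonym_groups:
        head = min(group)
        for term in group:
            canonical[term] = head

    postings: Dict[Key, Set[int]] = defaultdict(set)
    for accession, descriptors in catalogue.items():
        for prefix, term in descriptors:
            if term in stop_terms:
                continue  # purge nonsignificant terms
            postings[(prefix, canonical.get(term, term))].add(accession)

    # Alphabetize within category and assign arbitrary sequence numbers.
    sequence = {key: i for i, key in enumerate(sorted(postings), start=1)}

    def sequence_number(prefix: str, term: str) -> int:
        # Synonymous descriptors resolve to the same sequence number.
        return sequence[(prefix, canonical.get(term, term))]

    return dict(postings), sequence_number


# Hypothetical sample: "T" marks title words.
SAMPLE = {
    45931: [("T", "SONAR"), ("T", "OF"), ("T", "SIMULATION")],
    45940: [("T", "SONARS"), ("T", "SIMULATION")],
}
POSTINGS, SEQ = build_inverted_file(SAMPLE, {"OF"}, [{"SONAR", "SONARS"}])
```

Because inquiry cards are produced from the same edited file, any term a user can enter necessarily has a posting list, which is the vocabulary control the text points to.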

The second file, the “catalogue file" of complete docu- 
ment descriptions, is a machine edited and reformatted 
version of the catalogue master. Editing eliminates 
redundancy, and reformatting prepares the information 
for output on the teletype printer. 


OPERATING EXPERIENCE 


The first realistic file used in the system was hand
coded and consisted of 100 document references. Since 
then the file has been expanded twice: first to 1,600 and 
then to 8,000 references. An interesting sidelight is the 
growth in the number of distinct descriptors with respect 
to growth in file size. Some, such as report numbers, 
naturally increase more or less linearly. Some others are 
described in Table 1. 

These numbers display some interesting tendencies. The 
corporate author file is almost complete at 8,000 refer- 
ences. In fact, at 3,000 documents, the file is nearly 
as large as at 8,000. The asymptote is roughly 1,000 
corporate sources. 

The number of distinct title words is also approaching 
saturation, but the collection is not yet large enough to 
yield a good estimate for the limiting value. The other 
three descriptor categories are, at the 8,000-document 
level, in a region of linear increase. It seems likely that 
the slope of the personal author line and the contract
number line will continue with slight change as the col- 
lection increases. The subject headings, being combina- 
tions of terms, are potentially much more numerous than 
the individual terms; these will eventually level off, but 
the tendency is not visible as yet. 

TABLE 1.

                         File size
Descriptor category    100     1,600    8,000
Authors                 97     1,223    5,781
Title words            359     3,015    7,289
Subject headings       281     2,781   10,083
Contract numbers        53       522    2,200
Corporate sources       74       763      933

In summary it might be said that the available
statistics indicate that three numbers, .26 contracts per
document, .71 personal authors per document, and about
1,000 corporate sources, can be used as estimates to
characterize the entire collection. Although this describes a
specialized information center, it is an interesting
observation that the study of about 3,000 documents should
provide these statistics for other collections.
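The per-document ratios can be recomputed from Table 1 as a rough check. The sketch below is an illustration only; the function names are hypothetical. At the 8,000-reference level the table gives about .72 authors and about .28 contract numbers per document, which is near but not identical to the .71 and .26 quoted in the text. The text does not say how its figures were derived, so no attempt is made here to reproduce them exactly.

```python
# Distinct descriptors at each file size, transcribed from Table 1.
TABLE_1 = {
    "Authors": {100: 97, 1600: 1223, 8000: 5781},
    "Title words": {100: 359, 1600: 3015, 8000: 7289},
    "Subject headings": {100: 281, 1600: 2781, 8000: 10083},
    "Contract numbers": {100: 53, 1600: 522, 8000: 2200},
    "Corporate sources": {100: 74, 1600: 763, 8000: 933},
}


def per_document(category: str, file_size: int) -> float:
    """Distinct descriptors per document reference at a given file size."""
    return TABLE_1[category][file_size] / file_size


def marginal_growth(category: str, small: int, large: int) -> float:
    """New distinct descriptors gained per document added between two file sizes."""
    counts = TABLE_1[category]
    return (counts[large] - counts[small]) / (large - small)


if __name__ == "__main__":
    for name in TABLE_1:
        print(
            name,
            round(per_document(name, 8000), 3),
            round(marginal_growth(name, 1600, 8000), 3),
        )
```

The marginal growth figure makes the saturation argument visible: corporate sources gain only a few new entries per hundred added documents, while subject headings are still growing by more than one per document.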


USER REACTIONS 


As the utility of the system is directly related to the
number of documents represented, it was decided not to
risk large-scale use (and possible user disappointment)
until the most recent increase to 8,000 documents. The
period since this increase has not been long enough to
afford reliable statistics about the reactions of the users
of the system.

However, some patterns have become evident. Users
have grasped the rationale of the system quickly and
easily and, in the cases observed, have used it effectively
and been satisfied with the results. Two sources of
complaint are the awkward card input device and the fact
that the collection of documents is not larger. An
interesting remark was made by several users to the effect
that it made them impatient to wait 60 seconds while
their catalogue cards were being teletyped, even though
they felt they were saving hours or days of manual search.
It would seem that an automated system, to be completely
satisfactory, has to respond within a few seconds and
should present output results at roughly a normal reading
rate.


Converse II: System in Development


The statistics displayed in the previous section, relating
the growth in the number of descriptors to the number
of documents represented, clearly indicate the existence
of a practical upper limit to the size of the collection in
the Converse I retrieval file. For 8,000 documents, 38,285
inquiry cards were generated.

One way to avoid the deluge of prepunched cards and 
simultaneously maintain vocabulary control would be 
to furnish a key-punch machine and an authority list 
at each inquiry station. This solution was rejected as 
awkward for the user, disturbing to other library patrons, 
and in other ways inferior to the system sketched below. 

Converse II is designed around a teletypewriter inquiry 
station and the principle of allowing a “free” vocabulary 
at the start of the retrieval dialogue. The overall search 
divides itself into three phases. The first phase relates 
the user’s vocabulary to the system vocabulary to find 
appropriate search terms, the second phase is a pre- 
liminary search to restrict the collection to a number 
small enough to analyze in some detail, and the third 
phase is a detailed search of the small file. 

The first phase requires that different forms of the same 
term must be recognized. This demands a study of suffixes 
and the development of a practical program designed to 
identify the variants in most (say 90%) of the cases.
Another aspect of the problem demands that a thesaurus
be made available to the user, and that this thesaurus
be related to a parallel wordlist: that of the
content-descriptive terms derived from the collection. It is
foreseen that each time a word is output from the thesaurus
it will be compared to the wordlist, and if the program
finds an identical or variant term as an actual descriptor,
this fact will be indicated by the display of a number
(the number of documents labeled by this term) after
the output thesaurus term. The thesaurus word and the
list variant are identified for the purpose of the search.

Another problem that arises in the first phase is that
of partial identification. Provision will be made to relax
the demand for exact matching when searching on
personal and corporate authors. In the case of personal
authors, the obvious change is to permit the use of the
last name with one or no initials. In the case of corporate
authors it will be necessary to incorporate a "thesaurus"
to identify not only variant usages but to permit the use
of initials, etc.
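The relaxed matching proposed for personal authors (last name with one or no initials) could look something like the following sketch. The parsing rules and function names are assumptions, not the Converse II design; the catalogue form "ARNOLD, C.R." is taken from the sample catalogue card, and the sketch assumes that only initials, not full given names, follow the comma.

```python
from typing import List, Tuple


def parse_author(name: str) -> Tuple[str, List[str]]:
    """Split 'LAST, F.M.' into an upper-case last name and a list of initials."""
    last, _, rest = name.partition(",")
    initials = [ch.upper() for ch in rest if ch.isalpha()]
    return last.strip().upper(), initials


def author_matches(query: str, catalogued: str) -> bool:
    """Match when last names agree and the query's initials, if any, are a
    leading subset of the catalogued initials."""
    q_last, q_initials = parse_author(query)
    c_last, c_initials = parse_author(catalogued)
    if q_last != c_last:
        return False
    return c_initials[: len(q_initials)] == q_initials
```

A query of "Arnold" or "Arnold, C" would both retrieve the entry catalogued as "ARNOLD, C.R.", while "Arnold, D" would not.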

The second phase of the search is basically devoted to
creating subfiles of the total file, and uses Boolean
inverted-file manipulations to do so. The process might be
compared to a "coarse sieving," with the purpose of
eliminating the references which seem clearly irrelevant
so that a detailed search of the remainder becomes
feasible. One feature of such a process is that a subfile
created by a given formula can be used as a unit in the
next formula so that the terms used in its creation are
combined and there is no need to repeat the process
that created the file. This use of subfiles is clearly
recursive. Another point is that these subfiles are not
destroyed until the total search is at an end, so that
partial backtracking is facilitated.
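The reuse of subfiles as operands in later formulas can be illustrated with a small sketch. The class, the operator names, and the sample data are hypothetical; the text does not specify how Converse II labels or stores its subfiles.

```python
from typing import Dict, Set


class SubfileSearch:
    """Keeps every named subfile until the search ends, so earlier results
    can be reused as operands or revisited when backtracking."""

    def __init__(self, inverted: Dict[str, Set[int]]) -> None:
        self.inverted = inverted
        self.subfiles: Dict[str, Set[int]] = {}

    def _operand(self, name: str) -> Set[int]:
        # A name refers to a stored subfile first, otherwise to a descriptor.
        if name in self.subfiles:
            return self.subfiles[name]
        return self.inverted.get(name, set())

    def define(self, label: str, operator: str, left: str, right: str) -> Set[int]:
        a, b = self._operand(left), self._operand(right)
        if operator == "AND":
            result = a & b
        elif operator == "OR":
            result = a | b
        elif operator == "NOT":
            result = a - b  # documents in the left operand but not the right
        else:
            raise ValueError("unknown operator: " + operator)
        self.subfiles[label] = set(result)
        return self.subfiles[label]


# Hypothetical descriptors and accession numbers.
search = SubfileSearch({
    "SONAR": {1, 2, 3, 4},
    "DIGITAL": {2, 3, 4, 5},
    "HYDROPHONES": {3},
})
search.define("S1", "AND", "SONAR", "DIGITAL")
search.define("S2", "NOT", "S1", "HYDROPHONES")
```

Here S2 is built from S1 without repeating the intersection that produced S1, and S1 remains available if the user decides the exclusion was a mistake.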

The third phase, that of serial search, will permit the 
use of dates such as “published after June 1962” and the 
use of security classifications as identifiers. Eventually 
this phase will be developed to a greater degree of 
sophistication; but there is a limit in the basic data set 
itself — almost all the information on the catalogue card 
is being exploited already. When abstracts become avail- 
able to the system, many further refinements will be 
made. This is the area in which most future develop- 
ments are to be expected, so an effort is being made to 
keep it flexible and open for new kinds of search tactics. 

Plans have been made to place a terminal in the area 
where the library staff indexes incoming material. This 
terminal will provide access to the current thesaurus 
which is a locally modified version of the Engineers Joint 
Council retrieval thesaurus (5), the retrieval wordlist, and
the corporate-source authority list, so that the indexing 
and retrieval vocabularies will become as similar as 
possible. 

As a parenthetical note, it should be added that a
further modification, Converse III, is under consideration.
The major advantage of this future development will be
the availability of a CRT output device to allow the
retrieval program to become self-explanatory (a
"tutorial" mode of operation) as well as to remove most
of the restrictions on output imposed by the low speed
of the teletypewriter.


References

1. KESSLER, M. M., IVIE, E. L., and MATHEWS, W. D. 1964.
The M.I.T. Technical Information Project: A Prototype
System. Proc. Am. Doc. Inst., 1.
2. SALTON, G. 1964. A Document Retrieval System for
Man-Machine Interaction. 1964 Proc. ACM. Computation
Laboratory of Harvard University.
3. SWANSON, D. R. 1964. Design Requirements for a Future
Library. Libraries and Automation. Library of
Congress, Washington, D. C.
4. AMERICAN LIBRARY ASSOCIATION. 1963. The Library
and Information Networks of the Future. USAF
RADC Tech. Develop. Rep. No. 62-614.
5. Thesaurus of Engineering Terms. 1964. Engineers Joint
Council, New York.
On Laws of Special Abilities and the Production of
Scientific Literature


A quantitative estimate is made of the magnitude of
the problem posed by the quantity of scientific and
technical literature produced using improved estimation
procedures,


Introduction


Recent work by de Solla Price (1) and Bourne (2)
emphasizes the need to provide more definitive estimates
of the literary productivity of scientists and engineers.
It is possible that the so-called information explosion is
not as serious as has been thought. In any event it is time
that more powerful tools of analysis were brought to bear
on the problem of estimating the volume of documentation
because very real operational decisions depend on the
accuracy of these estimates.

Previous writers have relied on estimates of the number
of journals and a wide range of empirically arrived at
averages of the number of articles per journal. Others
have relied on estimates of the number of research
reports associated with research and development contracts.
The former method is handicapped because, as Bourne
points out, we are not sure how many articles there are
in a journal per year, nor which articles are eligible
to be considered as technical literature. If we consider,
as some have, the indexing process as sufficient evidence
of contribution, we run into the problem that there is
duplication among indexes; that is, articles appear in
more than one index.

In order to remedy the difficulty, some work done
by Lotka and others in the field of laws governing
special or unique abilities was reviewed. Lotka's law states
that the number of people writing n papers in a lifetime
is proportional to $1/n^2$. Unfortunately this does not
help to forecast annually the output of technical papers,
nor does it take account of highly prolific authors (8).
Fortunately there is another kind of mathematical
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which produce more conservative _ 


"LEROY Н. MANTELL 


estimates than those previously offered. The Mur is., 
са: ‘method of ‘estimation of literary productivity per 


man. The possibility that many contributions are pre- .- | 


sented i in more:than one publication is suggested. 


` Nasson College 
| Багно Maine 


у Е 
\- A. os ~ 


‘generalization called the Poisson йып "t can 
be used, rather effectively, for our purpose, The Poisson 
may be used tò estimate the probability of some event 
: occurring during а period of time. That is, if L is the 
expected number of arrivals in а period of time,- the’ 
"probability" of exactly n arrivals, i in'the period i8 given 
Љу Lhe7U/n| Р 
To illustrate the application of this formula, von Bortkiewicz's classic example concerning deaths from the kick of a horse in the Prussian Army is often selected, although needless to say many more up-to-date and equally as valid, but prosaic, examples are available.

Von Bortkiewicz collected information concerning the number of deaths which had occurred in a certain group of ten Prussian Army Corps over a 20-year period from 1875 to 1894 as a result of soldiers being kicked by a horse. He found 122 deaths recorded in the annual reports included in the study. The 200 reports in which the 122 deaths were recorded were distributed as shown in Table 1.

Table 1.

No. of deaths per Army
Corps per year             No. of reports
0                          109
1                           65
2                           22
3                            3
4                            1

This kind of information precisely fits the Poisson estimating expression. And, as we will see, the number of
learned papers or publications occurring in journals dur- 
ing a fixed time period also fits this expression. As a result, 
if we know the number of scientists and engineers, it is 
possible to estimate the number of papers that will be 
produced. 
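
The horse-kick illustration can be checked numerically. The following is a minimal sketch, assuming the classic von Bortkiewicz counts (109, 65, 22, 3, and 1 reports for 0 to 4 deaths); the function name `poisson_pmf` and the variable names are illustrative only and do not come from the article.

```python
import math

def poisson_pmf(n: int, mean: float) -> float:
    """Probability of exactly n events when the expected count is `mean`."""
    return mean ** n * math.exp(-mean) / math.factorial(n)

# Corps-year reports by number of deaths (classic von Bortkiewicz counts).
observed = {0: 109, 1: 65, 2: 22, 3: 3, 4: 1}

reports = sum(observed.values())                           # 200 reports
deaths = sum(n * count for n, count in observed.items())   # 122 deaths
L = deaths / reports                                       # expected deaths per report

# Expected number of reports in each class under a Poisson law with mean L.
expected = {n: reports * poisson_pmf(n, L) for n in observed}

for n in observed:
    print(f"{n} deaths: observed {observed[n]:3d}, expected {expected[n]:6.1f}")
```

With L = 122/200 = 0.61, the expected frequencies track the observed ones closely, which is the sense in which the data "fit" the Poisson expression.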

The results are based on tabulations of the number of learned papers and research results reported in a list of 29 abstracts and journal indexes. A value for L of .3 is found, and a new procedure due to Cohen (4) is used to estimate this average value when the zero frequencies are unknown. It is then possible to estimate the probable output of research articles and reports of all United States scientists and engineers working in research and development. It will be seen that this method of forecasting research output gives results which are somewhat more conservative than others.


• Some Background

Measurement of literary productivity was approached by Lotka through the use of an index of all known articles to the year 1900 appearing in Auerbach's Geschichtstafeln der Physik. He also used the decennial index of Chemical Abstracts for the years 1907-16, counting the number of names against which there appeared 1, 2, 3, etc., entries for those names beginning with A and B (a random sample) (5).

Dresden in 1922 tabulated the number of contributions of learned papers to the 25th anniversary meeting of the American Mathematical Society (6). Neither Lotka nor Dresden concerns himself with the larger universe of noncontributors, and the latter is forced to group his information in a specific way in order to obtain a suitable mathematical representation. Certainly if a formula is to be used which will estimate the probability of a person producing a learned paper, it should not be made to depend for its accuracy on the way in which information is classified. Also, as has been noted in the case of Lotka, but not shown here, the form of mathematical representation should not change merely because the time period covered has changed.

We are, however, indebted to both Lotka and Dresden, 
and using their technique the numbers of contributions 
per author listed in the Office of Naval Research Human 
Engineering Bibliography 1959-1960 were tabulated with 
the results shown in Table 2 (7). 

In other words, 1,718 out of 2,255 authors listed pro- 
duced one article or report only, 332 authors produced 
two each, and so on. 

Table 2. Contributions per Author: ONR Human Engineering Bibliography

No. of titles listed    No. of authors mentioned
 1                      1,718
 2                        332
 3                        109
 4                         48
 5                         31
 6                         10
 7                          3
 8                          2
13                          1
16                          1
Total                   2,255

When Lotka and Dresden attempted to fit mathematical expressions to data of this kind, they neglected to account for the number of persons who produced no output. For our purposes, measurement of this class is of vital importance, because many persons work diligently without producing anything of note to be shown within a given period of time. This factor becomes of even more importance the shorter the time period to be measured.


• The Estimating Procedure

The question now to be answered is, how many produced no reports? With such an estimate, the number who produced reports plus those who did not gives us the total number in the universe, and deriving such a parameter we can estimate the number of papers a given population of potential contributors is likely to produce.

Fortunately Cohen (8) has provided an answer. Using his tables for estimating the mean of a Poisson distribution with the zero value absent, we are able to restate the distribution as shown in Table 3.

We see that the use of the procedure has resulted in an estimate of over 2,000 noncontributors. Or, reversing our procedure, we can say that had we started with a population of 4,353 specialists in Human Engineering, we could have estimated 2,255 papers as the total output of this group.

Table 3. ONR Human Engineering Bibliography: Contributions per Author, Zero Frequency Supplied

No. of titles    Authors        %
 0               2,098       48.2
 1               1,718       39.5
 2                 332        7.6
 3                 109        2.5
 4                  48        1.1
 5                  31         .7
 6                  10         .2
 7                   3         *
 8                   2         *
13                   1         *
16                   1         *
Total            4,353      100.0

* Less than .1%.

Obviously the procedure is only as valid as the method of computing the zero frequency. Equally also, such a procedure must have some error limits associated with it; i.e., a range of values within which there is a high probability that a correct estimate lies.


The important parameters here are the averages, because essentially what the process does is to amend the average of the distribution calculated without the zero frequency in such a way that a new average is calculated as though there were a zero frequency. In the instance of the ONR data, the average output per man is 1.41 articles. The calculated L, or average of the Poisson distribution which assumes there is a zero class and which matches this mean of 1.41, is .7474. It develops also that the three standard deviation limits for this estimate are .7548 and .7405, which means that although the estimate of authors given in Table 3 is 4,353, an error of more than .4% or 16 authors either way is extremely unlikely. That is, the error of the estimate is very small.
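
The idea of supplying the missing zero class can be sketched numerically. The example below is not Cohen's tabular procedure; it is an illustrative substitute that assumes the standard maximum-likelihood condition for a zero-truncated Poisson distribution, namely that the observed mean among contributors equals $L/(1-e^{-L})$, and solves for L by bisection. The author counts are those tabulated for the ONR bibliography; all function and variable names are hypothetical. Because the method differs from the tables the article used, the result should be expected to land near, not exactly on, the printed figures.

```python
import math

# Authors by number of titles, zero class unknown (ONR counts from the article).
authors_by_titles = {1: 1718, 2: 332, 3: 109, 4: 48, 5: 31,
                     6: 10, 7: 3, 8: 2, 13: 1, 16: 1}

def truncated_mean(L: float) -> float:
    """Mean of a Poisson(L) variable conditioned on being at least 1."""
    return L / (1.0 - math.exp(-L))

def solve_poisson_mean(observed_mean: float, tol: float = 1e-10) -> float:
    """Find L whose zero-truncated mean equals observed_mean, by bisection."""
    # truncated_mean(L) is increasing and always exceeds L,
    # so the root lies between 0 and observed_mean.
    lo, hi = 1e-9, observed_mean
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if truncated_mean(mid) < observed_mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

contributors = sum(authors_by_titles.values())                  # authors with >= 1 title
titles = sum(n * a for n, a in authors_by_titles.items())       # total titles credited
observed_mean = titles / contributors                           # about 1.41 titles per author

L = solve_poisson_mean(observed_mean)
p_zero = math.exp(-L)                           # estimated share producing nothing
population = contributors / (1.0 - p_zero)      # contributors plus noncontributors
noncontributors = population - contributors

print(f"observed mean {observed_mean:.3f}, L {L:.4f}, "
      f"zero class {p_zero:.1%}, noncontributors {noncontributors:,.0f}")
```

Under these assumptions the sketch yields a zero class of a little under half the population and roughly two thousand noncontributors, in line with the order of magnitude shown in Table 3.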


• The Time Effect

A similar computation, performed on data contained in Communications of the Association for Computing Machinery Author Index 1958-1961, produced Table 4.

Table 4. ACM Author Index: Contributions per Author 1958-1961, Zero Frequency Supplied

No. of titles    No. of authors        %
0                 44                31.2
1                 54                38.3
2                 27                19.1
3                 13                 9.2
4                  2                 1.4
(illegible)        1                  .7
Total            141               100.0

The significance of the above table lies in the fact that the proportion of the frequencies in the zero class is less than the proportion in the class having one title each. In other words, the increased length of time, in this case 4 years, is apparently responsible for increases in literary productivity. The Human Engineering Bibliography covered one year's work; the ACM index covered four years' work. The proportion of authors having no contribution in the first case was about 48%; in the second 31%.

With this as a clue, a number of tabulations of indexes of learned periodicals and journals were made. The tabulations covered all time periods, from biweekly indexes of active research aids such as Chemical Abstracts to the 50-year record of contributions to the Quarterly Journal of Economics. The broad outline of system behavior was first established, then an intensive study of annual productivity was undertaken.

In all, 8 biweekly, 2 semimonthly, 8 monthly, 4 bimonthly, 1 semiannual, and 8 annual references, plus 5 references covering periods longer than annual, were used to establish the broad outline of the behavior of productivity over time. The intensive study of annual productivity covered 29 references. A listing of references is given in Appendix A to this article.
The technique amounted to counting the number of mentions opposite an author's name over a sample number of pages of the index. Joint authorship of an article was counted as single authorship and subsequent mentions under joint authors were ignored. Since actually each author should have received credit for less than a complete report, the final results will tend to be overstated because the percentage of contributors producing one report will be larger than it should be. To some extent, however, this may be compensated for by the fact that joint authorship produces more reports. That is, a team of four scientists may produce six reports under joint authorship. This would be counted as one author producing six reports, and the upper ranges of the distribution would then tend to be overstated.

The sample selection was strictly random, although there was a preference for the use of the alphabet as a basis for selection. In other samples a number of pages was selected at random.

Table 5 contains the result of taking samples of varying sizes from different abstracts and subjecting them to the Cohen technique. Where the number differs from that referred to earlier it is because samples from different time periods were combined for identical references.

The trend follows the intuitive belief established earlier, namely, the longer the period of time covered by the index, the greater the probability of a contribution to research, and the shorter the time period, the greater the likelihood of no contribution during the time period.


Table 5. Relationship of Per Cent Noncontributing to Time Period Covered by Abstract

Time period          Noncontributing, %
Biweekly (2)         90.9, 79.9
Semimonthly (1)      76.0
Monthly (2)          89.4, 80.6
Bimonthly (3)        82.0, 77.4, 75.3
Semiannual (1)       (illegible)
Annual (8)           56.2, 54.5, 49.5, 48.2, 43.6, 40.6, 38.8, 36.3
Two years (1)        48.2
Four years (1)       31.2
Five years (1)       (illegible)
Fifteen years (1)    36.0
Fifty years (1)      8.4


• Making the Most of a Contribution

Since one of the aims of this research was to give an annual estimate of literary productivity of scientists and engineers, journal literature indexes for one-year periods were studied intensively. Abstracts were not used at this time because of the possibility of duplication of articles. This simple precaution resulted in a rather interesting corollary to the main conclusion of the work, as we shall see shortly.

Twenty journal indexes were examined and sampled, and frequency distributions obtained as before. The data were then subjected to the Cohen procedure with the results shown in Table 6.

Table 6. Per Cent in Zero Frequency Class, Annual Journal Indexes Only

Zero frequencies, %    No. of journals
91-100
81-90
71-80
61-70
51-60
41-50

By comparison, the data in Table 5 for abstracts only, 
publishing on an annual basis, gives a range of 56.2% 
to 88.8%, with an average for the zero class of 50%. 
But the average of the above frequency distribution is 
75.7%. The difference can be seen quite clearly in Table 7 
where specific fields are given. 


In the range covered by the five fields listed in Table 7, about 76% of the authors will submit no papers for journal publication during a year. Those who do are likely to prepare, on the average, two articles on any annual contribution.

These conclusions are arrived at by the following means. The average proportion of scientists contributing to journals is 24%. The average proportion contributing to abstracts is 50%. But abstracts cover all journals in their field. Then two mentions in the abstract, with only one mention in a particular journal, would mean that a scientist making a research contribution is likely to prepare articles in more than one field, or more than one journal.

When tabulating from each journal, a scientist making a contribution is counted once. If a scientist has made contributions to two journals, he is counted as having made one contribution for each medium. In the abstracts, however, he is counted twice; once for each article.

Table 7. Per Cent in Zero Class for Specific Fields: Journal Indexes and Abstracts

                     Per cent in Zero Class
Subject              Journal index    Abstract
Ceramics             89.7             54.5
Chemistry            69.5             38.8
Psychology           85.7             56.2
Physics              62.?             40.1
Aeronautical Eng.    73.6             49.5

The further conclusion we draw is that we must look 
to the journal indexes rather than the abstracts for our 
productivity factor. Instead of the approximate 50% 
zero productivity factor obtained from abstracts data, 
we must look at the 75.7% factor obtained from the 
review of the journals, if we wish to get a true measure
of productivity. 


• Preparing the Final Estimate

The actual data obtained from a review and tabulation of the annual author indexes of learned or technical journals amounts to samples drawn from the universe of all such journals in all time periods. The details are shown in Table 8. The frequency totals obtained are given in Table 9 with the zero class supplied by computation.

The expected proportions are those which would have been obtained had the Poisson distribution been used instead of the actual data. The agreement is very close, with chi square computed at 1.62. Such a value could have arisen more than 99 times out of 100 due to chance alone. The actual and expected results are shown also in Fig. 1.


• Some Conclusions


If we apply the productivity schedule listed above to 
the National Science Foundation estimates of the num- 
bers of scientists and engineers in research and develop- 
ment in the United States, it should be possible to pro- 
vide an estimate of the amount of significant technical 
literature being produced annually. 

In April 1962, the NSF (National Science Founda- 
tion) reported the following numbers of scientists and 
engineers in research and development in the United 
States (9): 

1954 223,200 
1958 327,100 
1960 387,000 
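
A linear extrapolation from these three NSF figures can be sketched as follows. This is only an illustration: it assumes an ordinary least-squares line and treats each year as a single point in time, whereas the article speaks of a line "fitted approximately", so the projections need not match the article's figures exactly. The function names are hypothetical.

```python
# NSF counts of scientists and engineers in research and development, as quoted above.
counts = {1954: 223_200, 1958: 327_100, 1960: 387_000}

def fit_line(points: dict) -> tuple:
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(points) / n                 # iterating a dict yields its keys (years)
    mean_y = sum(points.values()) / n
    sxx = sum((x - mean_x) ** 2 for x in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points.items())
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(counts)

def projected(year: float) -> float:
    """Projected number of scientists and engineers for a given year."""
    return slope * year + intercept

print(f"growth per year: {slope:,.0f}")
for year in (1962.0, 1963.5, 1965.0, 1970.0):
    print(f"{year:6.1f}: {projected(year):,.0f}")
```

The fitted growth is on the order of 27,000 persons per year, which gives a 1970 projection in the same general range as the forecasts quoted in the text.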


By using several estimating methods which are essentially extrapolations from past growth rates, the following forecast of the number of scientists and engineers in research and development (excluding supervisory positions) can be made (10):†


January 1962    431,000
Average 1963    435,000
Average 1965    480,000
Average 1970    649,000


† Based on National Science Foundation data.

Average of two different indexes.

A straight line fitted approximately to the three estimates for 1954, 1958, and 1960 is given in Fig. 2. The line gives 420,000 at January 1962, 470,000 for June of calendar year 1963, and could well produce an estimate of 640,000 for 1970. It would appear that most estimates of the number of scientists and engineers are based on linear growth rates.

Table 9. Contributions per Author: Annual Journal Author Indexes, Zero Frequency Supplied

No. of titles listed    No. of authors mentioned    Actual distribution    Expected distribution

Now clearly the number of scientists and engineers assumed must be in place at least one year in some assignment or another in order to be productive in their line of work. We therefore assume that the number of personnel in place in July 1963 would be giving rise to publications in July 1964; certainly not prior to that date, and most probably later than that date.

There are certain advantages also in working with estimates for the year 1963. First, the forecast of the number of scientists and engineers has less opportunity to suffer effects of errors in extrapolation, and second, we may possibly be in position to verify the accuracy of the final result. It is necessary to remember, of course, that the publications estimate will be for 1964.

Working with an estimate of 465,000 potential authors as of July 1963, the probable annual product is calculated in Table 10.


Table 10. Calculation of Output of Scientists and Engineers in Research and Development during 1964

Expected                          Contributions    Total
proportion      No. of authors    per author       articles
22.2%           465,000           1                103,230
 3.3            465,000           2                 30,690
  .3            465,000           3                  4,185
  .1            465,000           4                  1,860
Total                                              139,965

Compared with the rather large numbers that have been offered by reputable and conscientious researchers in the field, it would appear almost obligatory to apologize for the small size of the estimate.

On the other hand, it does represent an average of 3.32 scientists and engineers per research paper, taking into account that 74.1% of all of them will produce no papers during the year. By comparison the average number of basic research scientists and engineers per basic research paper in 1959 is reported by the National Science Foundation as 24, a not unexpected difference (11).
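
The arithmetic behind Table 10 and the figure of 3.32 scientists and engineers per paper can be retraced with a short sketch. It assumes the Table 10 proportions are 22.2%, 3.3%, .3%, and .1% for one to four papers per author, and it cross-checks them against a Poisson distribution with L = .3; the variable names are illustrative only.

```python
import math

POPULATION = 465_000  # potential authors as of July 1963, from the text

# Expected proportion of authors producing n papers, as read from Table 10.
proportions = {1: 0.222, 2: 0.033, 3: 0.003, 4: 0.001}

# Articles contributed by each class: (authors in class) x (papers per author).
articles = {n: round(p * POPULATION) * n for n, p in proportions.items()}
total_articles = sum(articles.values())
scientists_per_paper = POPULATION / total_articles

# Cross-check against a Poisson distribution with mean L = 0.3.
L = 0.3
poisson = {n: L ** n * math.exp(-L) / math.factorial(n) for n in range(5)}

print(f"total articles: {total_articles:,}")
print(f"scientists and engineers per paper: {scientists_per_paper:.2f}")
for n, p in poisson.items():
    print(f"Poisson share with {n} papers: {p:.2%}")
```

The Poisson zero class of about 74.1% and the shares for one to three papers agree with the table. The tabulated .1% for four papers is somewhat above the Poisson value for exactly four, which may reflect rounding or the lumping of higher classes; that reading is an assumption.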


• Comparison with World-Wide Productivity Estimates

If we consider the productivity of scientists and engineers using the results obtained from sampling abstracts rather than the more conservative journal estimates, our productivity factor now becomes 1.46 scientists and engineers per paper written, a little more than twice the former estimate obtained.*

* Based on the following schedule of productivity:

0    40.5%        4    1.3%
1    38.2         5     .5
2     8.2         6     .1
3     2.1

Fig. 1. Per cent contributing and number of contributions per author annually: journal indexes.


If this average productivity is valid, the 480,000 scientists and engineers will produce 328,000 articles and reports in the United States in 1965.

By comparison, Kent (12) forecasts a total world output of technical literature in the same year of over 900,000.

When we compare these estimates, we see that the United States will produce over one third of the world's technical literature in 1965. Of this number the Department of Defense has financed possibly 174,000 and of these 75,000 will represent actual research results, rather than reworkings or reinterpretations.
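
The 1965 comparison reduces to a few divisions, sketched below with the figures quoted in the text. The variable names are illustrative, and the world total is taken as exactly 900,000 although the text says "over 900,000", so the computed share is an upper-side approximation.

```python
SCIENTISTS_1965 = 480_000        # forecast number of scientists and engineers
SCIENTISTS_PER_PAPER = 1.46      # productivity factor from the abstracts data
WORLD_OUTPUT_1965 = 900_000      # Kent's world forecast, taken at its lower bound
US_OUTPUT_QUOTED = 328_000       # United States figure as printed in the text
DOD_FINANCED = 174_000           # possibly financed by the Department of Defense
DOD_ORIGINAL_RESULTS = 75_000    # of those, actual research results

us_output = SCIENTISTS_1965 / SCIENTISTS_PER_PAPER
us_share_of_world = US_OUTPUT_QUOTED / WORLD_OUTPUT_1965
dod_share_of_us = DOD_FINANCED / US_OUTPUT_QUOTED
original_share_of_dod = DOD_ORIGINAL_RESULTS / DOD_FINANCED

print(f"US articles and reports: {us_output:,.0f}")
print(f"US share of world output: {us_share_of_world:.1%}")
print(f"DoD-financed share of US output: {dod_share_of_us:.1%}")
print(f"research results among DoD-financed items: {original_share_of_dod:.1%}")
```

Dividing 480,000 by 1.46 gives a little under 329,000, which the article rounds to 328,000, and that figure is somewhat more than one third of 900,000.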


• Summing Up

Briefly then, what has been shown is that despite frequent references in the press and elsewhere concerning an exponential growth in technical literature, there does not seem to be any evidence to this effect. The growth of the numbers of scientists and engineers seems to be linear, not exponential, according to reliable estimates. Secondly, analysis of actual productivity applied to these estimates fails to indicate anything like the volume of information other writers have predicted would occur.

Of course the procedure does assume a constant average productivity per investigator based on journal output during the period 1960-61. Should this average productivity be found in later years to have risen, then estimates of annual output would also have to increase. Certainly a more exhaustive investigation of the whole field is indicated by these preliminary results.

Fig. 2. Number of scientists and engineers in the United States in research and development.


APPENDIX A


Source of Data

Biweekly

Technical Publications Announcements, 27 September 1962, 2(13), NASA, A-L inclusive, M-Z inclusive.

Chem. Abstr., 6 August 1962, 57(3), American Chemical Society, A and B.

Semimonthly

Nucl. Sci. Abstr., Personal Author Index; 31 August 1962; U. S. Atomic Energy Commission; A-C inclusive, Q-Z inclusive.

Monthly

Index Med.; April 1962; 3(4); National Library of Medicine, U. S. Department of Health, Education, and Welfare; pp. A415-A495 inclusive, pp. A494-A502 inclusive.

Solid State Abstr., 1962, 3(2), Cambridge Communications Corp., Abstracts No. 14347-14694.

Bimonthly

Astronaut. Inform. Abstr., Reports and Open Literature, August 1962, 6(2), Abstracts 60,308-60,603, Jet Propulsion Laboratory, California Institute of Technology, Pasadena. Author Index Nos. 60,001-60,603.

Psychol. Abstr., April 1962, 36(2), American Psychological Association, Inc., Washington, D. C. Author Index, A-D inclusive.

Psychol. Abstr., June 1962, 36(3). Author Index, A-D inclusive.

Biol. Abstr., January-February 1961, 36, University of Pennsylvania. Author Index, A-B inclusive.

Semiannually

Nucl. Sci. Abstr., Semiannual Index, 30 June 1962. January-June 1962, 16(12B), U. S. Atomic Energy Commission. I, J, and part of K.

Annual

Phys. Abstr., Science Abstracts Section A, 63, Author Index Number, 1960, pp. 2101a-2108b, 2277a-2281a.

Phys. Abstr., 1948, 52, A only.

Aeronaut. Eng. Index, 1957. Institute of the Aeronautical Sciences, New York, 1959. 547 items from beginning of A, 347 (all) P, and 546 from end.

Psychol. Abstr., Annual Index Number, December 1952, 26(12), A-Bion inclusive and L-Mayer-Gross.

Chem. Abstr., Author Index, 1961, A-page 21, B-beginning page 73, pp. 253-261 inclusive.

Analyt. Chem., 1961, 33, American Chemical Society, Washington. Author Index, pp. 1971-1980, A-D inclusive.

Trans. ASME, January 1958, 80, containing Index to ASME Transactions, 1957, 79, pp. SR 133-SR 145, A-S inclusive.

Trans. ASME, containing Index to Mechanical Engineering, January-December 1957, 79, pp. SR 105-SR 181.

J. Am. Chem. Soc., 1961, 83. Author Index, pp. 5053-5108, A and B.

J. Am. Ceramic Soc., 1961, 44, Columbus. Author Index, 1 December 1961, 44(12).

Ceramic Abstr., 1 December 1961, American Ceramic Society.

J. Acoust. Soc. Am., December 1960, 32(12). Author Index, 32, p. 1722, A-D inclusive.

Bull. Am. Math. Soc., January-December 1961, 67.

J. Chem. Phys., December 1961, 35(6), pp. 2274-2286. Author Index, 35, A-D inclusive.

J. Phys. Chem., 1960, 64. Author Index, A-D inclusive.

J. Scient. Instrum., 1961, 38. Index, A-L inclusive.

J. Appl. Psychol., 1961, 45. Author Index, all.

J. Appl. Phys., December 1961, 32(12). Author Index to Vol. 32, A-C inclusive.

J. Opt. Soc. Am., December 1960, 50(12). Author Index, 50. All.

J. Aerospace Sci., December 1961, 28(12). Index, 28, pp. 1000-1007. Author Index, A-G inclusive.

Phys. Rev., 15 December 1960, 120(6). Author Index, 117-120, covering the year 1960, A-C inclusive.

Psychol. Bull., 1960, 57. Table of Contents.

Psychol. Bull., 1961, 58. Table of Contents.

Psychometrika, 1959, 23. Index.

Psychometrika, 1959, 24. Index.

J. Consult. Psychol., 1959, 23.

J. Consult. Psychol., 1960, 24.

J. Consult. Psychol., 1961, 25. Table of Contents.

Nucleonics, December 1960, 18(12), Author Index. January-December 1960, 18, pp. 158-160.

Transactions, American Geophysical Union, 1960, 41. National Academy of Sciences, National Research Council, Washington, D. C.

Am. J. Math., 1951, 73, Johns Hopkins Press, Baltimore.

Human Engineering Bibliography 1959-60. Report, October 1957, Tufts University, Institute for Psychological Research, Human Engineering Information and Analysis Service, Report No. ACR-69, Office of Naval Research.

Five Years

Psychopharmaca, a bibliography of Psychopharmacology, 1952-57. National Library of Medicine, Public Health Service, U. S. Department of Health, Education, and Welfare. Public Health Service Publ. No. 581, Public Health Bibliography Series No. 19. Anne D. Caldwell, M. D., Ed. U. S. Government Printing Office, Washington, 1958. A and B.

Fifteen Years

Bibliography on Shock and Shock Excited Vibrations, January 1958. I. N. Brermnn, Ed. Engineering Research Bulletin No. 69, College of Engineering and Architecture, Pennsylvania State University. Covers 1938-56. A-E inclusive.

Fifty Years

Quart. J. Econ. Index 1886-1936. Harvard University Press, 1936. A-F inclusive.


Indexing Problems and Some of Their Solutions

This paper concerns problems of redundancy and inaccuracy in the indexing of technical information, illustrated by the coordinate indexing scheme employed in the NASA Information System. A proposal is made for the elimination of “panacea” or “catch-all” terms, and a rule for uniform grammatical negation is given. The effects of synonyms, antonyms, and negations on the overall efficiency of the information system are illustrated. Merits are discussed and rules are given for indexing under acronyms whenever possible. Finally, the concept of a “pictorial thesaurus” is proposed to exhibit hierarchy and connectivity of terms as an aid to indexing and retrieving of information.

LOUKAS LOUKOPOULOS

Center for Application of Sciences and Technology
Wayne State University
Detroit, Michigan 48202


• Introduction

The use of the computer, although a boon to rapid location of information, has stunted the growth of indexing techniques which promote optimum effectiveness of retrieval in a viable information system. The ease of information location contributes to redundant indexing by encouraging the assignment of many descriptors to a document. On the one hand, this facilitates the retrieval operation by minimizing the necessity of a rigorous presearch hunt for the exact terms which characterize the search topic. On the other hand, it often creates the need for reanalysis after retrieval because of the large number of irrelevant items printed out. Moreover, the cost of the time spent in screening the yield of documents for relevancy is an important factor, since in general, the searcher's or user's time is much too valuable to be spent in this way.

The principal requisites for an information-retrieval system are predictability and relevancy of the retrieved information. Unpredictability fosters doubt as to the completeness of an information-retrieval task. A consequence of this is more searches and an increase in output cost. If the literature searcher, in an effort toward completeness, engages in prolific machine searching within a given problem area, he will undoubtedly be confronted with redundancy in the retrieved information. The strategy underlying the retrieval of information from a computer-centered system can be very difficult to formulate. This is because of the multiple connotations of many descriptors, and the inevitable human inconsistencies in distinguishing among a given set of eligible terms under which a document may be indexed. The result is often an information system that is neither sufficiently accurate nor sufficiently predictable. By careful procedures the experienced searchers can avoid many of the pitfalls of inaccurate or redundant indexing, but not without spending extra and valuable time.

This paper will consider only questions and problems on the indexing aspect of an information system, since indexing is the first step to accurate and useful information retrieval. To some extent this approach to indexing will be made from the standpoint of the grammatical structure and logical implications of descriptive terms. One reason for taking this approach is to obtain certain simple rules whereby the descriptor's grammatical structure can be used to differentiate among varieties that can be described by the term.

For purposes of illustration, this paper will draw upon the system model of the National Aeronautics and Space Administration (NASA) which services Wayne State University's Center for Application of Sciences and Technology (CAST). It will concentrate on the machine term vocabulary aspect, with occasional digressions into searching techniques. The attempt is to consider only a few aspects of the indexing operation, rather than to exhaust the whole topic. Hopefully, the results of this analysis will help in the design of simpler and more concise user-oriented information systems.


• The Information System

An information system is comprised of a computer, a storage element, a machine term vocabulary (MTV), a searcher, and a user. The MTV is a collection of all terms under which the information mass has been indexed and stored. The searcher is the person who formulates the user's question, in a language governed by MTV, and presents it to the computer. The user is the person who generates the question and who will ultimately use the retrieved information to accomplish a specific task.

Only the basic concepts of set theory are relevant to this discussion. Of particular use are the concepts of set, subset, the union (+), intersection (x), and complementation (c) of sets. The set-theoretic approach is taken for two reasons. The first one is based on the computer's ability to be programed to satisfy Boolean relations. The second reason is that each MTV term can be considered as a set whose elements comprise the documents indexed under the term.
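
The set-theoretic view of index terms can be illustrated with a small sketch. The document numbers and the assignment of documents to terms below are hypothetical and are not taken from the NASA system; Python's built-in set operators stand in for the union (+), intersection (x), and complementation (c) named above.

```python
# Hypothetical document numbers indexed under three MTV terms.
index = {
    "HEAT": {101, 102, 103, 104},
    "AIR": {103, 104, 105},
    "SOUND": {104, 106},
}

# The universe of documents is everything indexed under at least one term.
all_documents = set().union(*index.values())

heat_or_air = index["HEAT"] | index["AIR"]       # union (+)
heat_and_air = index["HEAT"] & index["AIR"]      # intersection (x)
not_sound = all_documents - index["SOUND"]       # complementation (c)

# A Boolean search question: HEAT and AIR, but not SOUND.
heat_and_air_not_sound = heat_and_air & not_sound

print(sorted(heat_or_air))
print(sorted(heat_and_air))
print(sorted(heat_and_air_not_sound))
```

Each search question then becomes an expression over sets of document numbers, which is the sense in which the computer can be programed to satisfy Boolean relations.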

In general, MTV terms can be divided into two mutually exclusive types. The simplest type of term is a single word such as AIR, HEAT, SOUND, WORK, etc. The other type of term is comprised of a series of words such as BUBBLE CHAMBER, CROSSED FIELD AMPLIFIER, PULSE WIDTH MODULATION, PLASMA ARC METAL SPRAYING, etc. For the present it will suffice to note that the second type of term is of the form AN, AAN, AAAN . . . where A and N stand for Adjective and Noun, respectively.

The MTV is a necessary link between searcher and
computer because only those terms found in MTV are
meaningful to the computer. In addition to placing a
limit on the number of computer-understood terms, MTV
actually defines the information mass as a function of the
totality of its entries. This defining of the information
mass is rather narrow because the index terms reflect the
terminology of the author and, to a lesser degree, the
terminology of the abstracter and indexer of each
document. Oftentimes these terminologies do not coincide
with the user's, and effective communication among user,
searcher, and computer breaks down. For this reason,
reference works such as technical dictionaries and
thesauri are essential to every presearch analysis. Some
of the most useful references are Thesaurus of Engineering
Terms (EJC Thesaurus), Euratom Thesaurus, Space
Age Dictionary, Van Nostrand's Scientific Encyclopedia,
and Webster's New International Dictionary.

In all that follows, it shall be assumed that an abstract
of a document has been given for indexing. It will be
further assumed that the abstract has been furnished
by the author of the document, or generated by someone
who is knowledgeable in the field to which the
document relates.


• Panacea Terms


The foremost problem associated with the information
system defined in this paper is the existence and rate of
growth of "panacea" or "catch-all" terms. With
particular reference to the NASA system, a panacea term
is defined to be that MTV entry under which at least
1,500 abstracts have been indexed. More generally,
panacea terms are defined to be those which, through
either their frequency of usage or broadness of scope,
possess little or no definitive power. Examples of such
terms are: AIR, ATMOSPHERE, DATA, HEAT, WORK, etc.
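
The frequency-based definition lends itself to a mechanical check. The following is a minimal sketch, assuming accession counts per term are available as a dictionary. The counts for WORK and NOISE echo figures quoted elsewhere in this paper; the other two counts and the function name are hypothetical.

```python
PANACEA_THRESHOLD = 1500  # threshold quoted above for the NASA system


def panacea_terms(counts, threshold=PANACEA_THRESHOLD):
    """Return the terms whose number of indexed abstracts meets the threshold."""
    return sorted(term for term, n in counts.items() if n >= threshold)


# Accession counts per MTV term; BUBBLE CHAMBER and HEAT are invented values.
counts = {"WORK": 3000, "NOISE": 2500, "BUBBLE CHAMBER": 40, "HEAT": 1500}

print(panacea_terms(counts))  # ['HEAT', 'NOISE', 'WORK']
```

A count alone captures only the frequency half of the definition; broadness of scope still calls for the indexer's judgment.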

Panacea terms are detrimental to the effective operation
of the information system for many reasons, the
most important being the unpredictability and irrelevancy
of search results associated with their use. For example,
under present indexing schemes, the term WORK has
abstracts indexed under it that deal with such diverse topics
as "working sleigh dogs," "work hardening of metals,"
"thermionic work functions," and many more. A searcher
wanting only information dealing with "thermionic work
functions" would have two search alternatives. The first
would be to search the computer via the intersection
THERMIONIC × WORK × FUNCTION/S. The second would
be to examine 3,000 abstracts indexed under the term
WORK. Experience has shown that in the majority of cases,
an abstract dealing with a sequence such as "thermionic
work functions" is not necessarily indexed under all three
terms, and consequently would not be obtainable from the
above intersection. By some means, however, abstracts
containing sequences such as those cited above most often
are indexed under that term in the sequence that happens
to be a panacea term (in this case WORK). After a few
failures with sophisticated intersection-type searches, the
searcher is forced to rely on computer "dumps" rather
than on intersections and complementations. This practice
increases the output cost and reduces searching capacity
and efficiency.
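
The failure of the intersection search can be reproduced with a toy example. The postings below are hypothetical; abstract 7 is assumed to deal with thermionic work functions but to have been indexed only under the panacea term WORK.

```python
# Hypothetical postings: abstract 7 is relevant but indexed only under WORK.
postings = {
    "THERMIONIC": {1, 2},
    "WORK": {1, 2, 7, 8, 9},
    "FUNCTION": {1, 9},
}
relevant = {1, 7}  # abstracts that actually deal with the topic

# THERMIONIC x WORK x FUNCTION
hits = postings["THERMIONIC"] & postings["WORK"] & postings["FUNCTION"]
missed = relevant - hits

# The fallback is a "dump" of everything indexed under WORK.
dump = postings["WORK"]

print(sorted(hits))    # [1]
print(sorted(missed))  # [7]
print(len(dump))       # 5
```

The intersection is precise but incomplete, while the dump is complete but has to be screened by hand, which is the trade-off described above.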

Analysis of more than 800 computer searches indicates
that the number of accessions listed under each of the
250 panacea terms in the NASA MTV is increasing at
the rate of 130 each month. This means that, on the
average, every panacea term grows by about 1,500
accessions per year (130 × 12 = 1,560), thereby losing its
usefulness as a search term. The main reason for this
phenomenon is a lack of preindexing term analysis: present
indexing criteria oblige the indexer to index under those
terms appearing in the title and body of the abstract rather
than terms describing the content of the abstract.
Examples of this may be seen by considering computer
searches using the following three terms: (1) LOW
TEMPERATURE ENVIRONMENT, (2) NEGATIVE RESISTANCE
DEVICE, and (3) MODULATION INDUCING RETRODIRECTIVE
OPTICAL SYSTEM. The abstracts obtained by using Term 1
showed that EFFECTS rather than ENVIRONMENT would
have more accurately indicated the subject matter. Only
in a minority of the abstracts did the sequence "low
temperature environment" appear in toto. In contrast,
the majority of the abstracts contained only the sequence
"low temperature" with the word "environment"
appearing elsewhere in the body of the abstracts. It seems
that the indexer, having a priori knowledge of the existing
MTV term LOW TEMPERATURE ENVIRONMENT, used this
term as a further basis for indexing rather than
differentiating between words modified by the sequence "low
temperature."

The results obtained by using Term 2, NEGATIVE
RESISTANCE DEVICE, indicate once more the indexer's
tendency to ignore the content of the abstract and concentrate
on previously established MTV term meanings. The
use of the word "device" is both misleading and
redundant, as the majority of the abstracts did not deal with
a device except in the extended sense of the word. The
indexer considers a transistor as a "transistor device,"
a tunnel diode as a "tunnel diode device," ad infinitum.
In this instance, indexing the abstracts under the term
NEGATIVE RESISTANCE would have sufficed since this term
completely characterized them.

The output due to Term 3 again proves that the indexer
is unaffected by the content of the abstracts. More
important, it shows that the terminology of a very small
number of abstracts can be taken too seriously by the
indexers. As a result, abstracts dealing with identical
subject matter expressed in a slightly different terminology
will not be indexed under the same MTV term. Six
of the total of 11 abstracts that were indexed under Term
3 used the term either as their title or as part of their
title. Of the remaining 5, one was completely misindexed,
and the remaining 4 simply referred to the acronym
MIROS, which stands for Term 3. Since MIROS is a
NASA-sponsored research project, the 10 abstracts
reflected a particular terminology as well as a research area.
The fact that no "open" literature (e.g., published articles
in journals, books, etc.) was indexed under Term 3 can
either mean that only NASA is interested in this area or
that independent research along similar lines lags by at
least three years. Both of these alternatives are
untenable. Hence, it seems reasonable to assume that the
absence of open literature on this subject, under this term,
is due to difference in terminology.


The first step toward the elimination of panacea terms
is to create indexer awareness of the problems they can
create, some of which have been discussed above. The
fundamental rule for eliminating panacea terms is that
MTV entries should convey the content rather than the
terminology of any abstract or collection of abstracts. In
addition, indexing under single words whose meaning is a
function of words they modify or words that modify them
should not be allowed. Descriptive terms to be entered
into MTV need not and should not necessarily be taken
from the title of the abstract to be indexed. If a term
allows a broad interpretation, it should modify or be
modified by one or more terms in such a way that its ambiguity
or broadness is reduced. If this is not feasible, the term, in
spite of its appearance in an abstract, should not be
entered into MTV. Every effort should be made to gear the
content of an abstract to descriptors which are a part of
the user's terminology, especially when multiple-modified
terms are involved. Discretion should be used in picking
word sequences or clusters from abstracts and entering
them into MTV without first considering the user's
terminology. For example, an abstract should not be
indexed under INTERMOLECULAR BONDING simply because the
cluster appears in it when it could be indexed under the
equivalent and more popular term PRESSURE WELDING.

However, care should also be taken to avoid excessive
modification of panacea terms. For example, consider the
terms CROSSED FIELD AMPLIFIER, PULSE WIDTH
MODULATION, and PLASMA ARC METAL SPRAYING. The first two
terms are of the form AAN, while the third term has three
adjectives (AAAN) modifying the noun. In essence, each
successive adjective modification of a term represents the
intersection of that adjective with the remaining
adjectives and the noun; i.e., AN means A × N, AA'N means
A × A' × N, and AA'A"N means A × A' × A" × N. Thus,
the original set AN shrinks (becomes more descriptive)
as the number of adjectives increases. The indexer's ability
to recognize that descriptive terms are useful only to the
degree that they do not limit meaning by overmodification
is essential to the proposed method for eliminating
panacea terms. For example, the sets PLASMA METAL
SPRAYING or PLASMA SPRAYING are no less descriptive than
their mutual subset PLASMA ARC METAL SPRAYING, but are
less restrictive.
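
The reading of adjective modification as successive intersection can be illustrated as follows. This is a sketch only: the single-word postings are hypothetical, and `modified_term` is a name introduced here for the example.

```python
# Hypothetical postings for single words.
postings = {
    "PLASMA": {1, 2, 3, 4, 5},
    "ARC": {2, 3, 6},
    "METAL": {1, 2, 3, 7},
    "SPRAYING": {1, 2, 3, 4, 8},
}


def modified_term(*words):
    """Treat a term such as AAN as the intersection of its words' sets."""
    return set.intersection(*(postings[w] for w in words))


plasma_spraying = modified_term("PLASMA", "SPRAYING")
plasma_metal_spraying = modified_term("PLASMA", "METAL", "SPRAYING")
plasma_arc_metal_spraying = modified_term("PLASMA", "ARC", "METAL", "SPRAYING")

print(sorted(plasma_spraying))            # [1, 2, 3, 4]
print(sorted(plasma_metal_spraying))      # [1, 2, 3]
print(sorted(plasma_arc_metal_spraying))  # [2, 3]

# Each added adjective can only shrink the set.
print(plasma_arc_metal_spraying <= plasma_metal_spraying <= plasma_spraying)  # True
```

Every added adjective narrows the set, so an overmodified term risks excluding abstracts that a searcher would still consider relevant.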

The above proposed techniques, while helpful to the
system, are, in reality, a trend toward increasing the
number of generic levels for indexing. To a large degree,
economics favors this trend provided it is carried out with
discretion. If this is not the case, both input and output
costs will rise. Implementation of this trend for achieving
functional and economic success is dependent on feedback
from the system's users. The following statement by Dr.
Mortimer Taube is apropos to this discussion:


One of the things we have noticed in designing systems
is that one can post or index an item on a number of
different generic levels. Indexing on a number of
different generic levels in principle increases the input cost
and reduces the output cost. A saving can be made on
the input side by indexing on a single level and by
making logical sums to provide answers to general questions
on the output end. This, of course, is an expensive way
to search and if your clients ask you for many questions
which involve making logical sums, the cost of searching
will go up. On the other hand, if you attempt to
anticipate what your clients are going to want and post, that
is, index, on all the generic levels you can think of, you
will increase your input costs immeasurably.

(See B. E. Holm in Bibliography.)


• Synonyms


Whenever a literature searcher makes a computer
search on a specific topic, he uses those MTV terms which,
as a function of his knowledge and reference sources,
characterize the topic. If the search results prove
inadequate, he has no recourse but to make additional searches
using different but related MTV terms. Very often, the
searcher's lack of knowledge of those MTV terms which
are synonymous creates the need for additional searching.
The NASA system is constructed in such a way that, if
each of two synonymous terms is placed on a separate
search request sheet, the computer will consider them
independently. Therefore, any abstract that has been
indexed under both of the synonymous terms will appear
in both search yields. Consequently the searcher or user
is faced with the same abstract twice when he screens the
search yields.

The other major problem associated with the use of
synonyms can best be illustrated by the following
argument. Let A and B be any two synonymous terms
appearing in the MTV. Let X1, X2, X3, X4, X5, X6, X7 be seven
abstracts such that the following are true: four of the
abstracts each contain both A and B; one contains A but
not B; and two each contain B but not A. Then the
indexer is free to choose one or more of the following
alternatives for indexing these seven abstracts:

1. He may index according to the terminology of the
   abstract. This means that the five abstracts containing
   A and the six abstracts containing B would certainly be
   indexed under A and B, respectively.

2. He may index the abstracts using only the terms A
   and B, but should a choice exist, he is free to
   exercise his preference of one term over the other.
   This means that each or all of the abstracts that
   contain both terms can be indexed under both A and B,
   under B only, or under A only.

3. He may index, whether a choice exists or not, under
   A, under B, or under both, simply because he
   knows a priori that A and B are synonyms but
   prefers one to the other. This means that he
   may index an abstract under B even though B does not
   appear in it.

Suppose that the indexer has been consistent with the
alternatives referred to above. If the searcher dumps
Term B, he will lose the abstract which contains only A. If he
dumps Term A, he will most likely lose the two abstracts
that contain B only. In either case some abstracts will be
lost. If he dumps Terms A and B separately, he will then
be faced with the same abstract/s in both search results.
In either case, the searcher loses some abstracts or is
faced with redundancy. Since these problems are
intimately connected with those discussed in the next two
sections, suggestions for their solutions will be postponed
until then.


• Hierarchy


The indexing of abstracts is intimately connected
with the ordering of terms as well as with their meaning.
By far the most definitively elusive and difficult of all the
information indexing concepts is that of term hierarchy.
Closely related to hierarchy is the connectivity of terms.
In general, hierarchy implies connectivity but not the
converse. An easily digestible definition of term hierarchy
is the graded ordering of terms in a vertical manner, such
that the term with the broadest spectrum occupies the
uppermost position. On the other hand, term connectivity,
in general, possesses a nonvertical character. In this paper,
term hierarchy, in addition to its vertical character, will
be considered as a chain of sets. By this is meant that each
term, except the uppermost or mother term, shall
necessarily be a subset of the one immediately above it and a
superset of the one immediately below it. The mother set
will be a superset of every set in the chain. In the case of
term connectivity, no such inclusive order will, in general,
be assumed.

Before plunging into examples of some common types
of hierarchy and connectivity, the method of representing
them will be discussed. The old cliché about a picture
being worth a thousand words has real meaning for the
systematic indexing of technical literature. Pictorial or
schematic representations of hierarchy and connectivity
of terms are more easily read, assimilated, and retained.
Linear graphs (straight lines) are the simplest form of
pictorial connectives. Linear graphs are simple to
construct, easy to follow, and possess dynamic form. On the
other hand, lists of terms such as those found in technical
thesauri are dull, rambling, and static and defy any
attempt at retention. In addition to those shortcomings,
lists of terms do not convey the hierarchy of a term
family beyond the first order of magnitude.

The requirement, if it exists, that an indexer must
consult a technical thesaurus for indexing an abstract on a
subject about which he has dubious knowledge is proper
but expensive. The consultation of a reference such as the
EJC Thesaurus whenever there exists doubt regarding the
use or nonuse of a term is time consuming as well. To
further require that a 200-word abstract be indexed properly
in less than 15 minutes under the above conditions is
unrealistic. There are, however, other ways which may
increase the accuracy and decrease the labor of the indexing
task. Through the use of linear graphs, relationships and
order regarding terms which represent varieties about a
certain interest area may be detailed more effortlessly,
forcibly, and quickly. It is envisioned that individual
figures similar to those found in this paper may become
much more fashionable than word lists. These figures,
reproduced either on desk-size flip charts or on microfilm
cards with an accompanying selector-reader, would
essentially be a pictorial thesaurus. With such a thesaurus,
a flip of the hand or the push of a button would diminish
ambiguity and facilitate understanding.

The author at this point must acknowledge a major
debt to the authors of the Euratom Thesaurus. Though
limited to the display of term connectivity, this work
served as impetus for the enlargement of the scope of
pictorial representations to include term hierarchy and its
consequences.

The terms JOINING, RADAR, and LASER have been chosen
to demonstrate the concepts of hierarchy and connectivity
as they should be used in creating a pictorial thesaurus.
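
The chain-of-sets definition of term hierarchy given in this section can be checked mechanically. The following is a minimal sketch under stated assumptions: the postings are invented, the three terms are borrowed from the JOINING family only as labels, and `is_chain` is a name introduced for the example.

```python
def is_chain(levels):
    """True if every set is a subset of the set immediately above it.

    `levels` lists the term sets from the mother term downward.
    """
    return all(lower <= upper for upper, lower in zip(levels, levels[1:]))


# Hypothetical postings, listed from the mother term downward.
joining = {1, 2, 3, 4, 5, 6}
welding = {1, 2, 3, 4}
arc_welding = {1, 2}

print(is_chain([joining, welding, arc_welding]))  # True
print(is_chain([joining, {1, 9}]))                # False
```

Because the subset relation is transitive, checking adjacent levels is enough to guarantee that the mother set is a superset of every set in the chain.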


[Figure 1: tree diagram. Recoverable labels: JOINING, GLUING,
CEMENTING, CEMENT, ATOMIC, ORGANIC, INORGANIC, SYNTHETIC;
an antonym link joins ORGANIC and INORGANIC.]

Fig. 1. Hierarchy of the term JOINING. (Synonyms designated by dashes
[- - -]; antonyms designated by dash, dot [- · - ·].)


[Figure 2: tree diagram. Recoverable labels: JOINING; FUSION;
INTER-DIFFUSION; FORGE WELDING (High Temperature, High Pressure);
WELDING; GAS; ARC (Metal or alloy electric arc); RESISTANCE;
PRESSURE; ULTRASONIC; ELECTRON BEAM; LASER; OXYGEN-ACETYLENE;
SUBMERGED ARC; CARBON ARC; SHIELDED ARC (Arc shielded by an
inert gas); SPOT; PROJECTION (Mass Production modification);
BRAZING (Hard Soldering); SILVER BRAZING; COPPER BRAZING;
BRONZE BRAZING; SOLDERING. SPOT and PROJECTION are marked as
similar methods but not synonymous.]

*Definition of BRAZING: Any method of raising
temperatures greater than 1000° F but not to
the melting point of the metals or alloys
involved.

Fig. 2. Hierarchy of the term JOINING.


[Figure 3: tree diagram. Recoverable labels: RADAR; MAPPING;
TRACKING & MAPPING; LIDAR; LASER = LASER RADAR; COLIDAR;
USES (WELDING, MEDICINE); GAS; SOLID; LIQUID; Argon; Ruby.]

Fig. 3. Hierarchy of the terms RADAR and LASER.


Figs. 1 and 2 display the hierarchy of the term JOINING
as defined by Van Nostrand's Scientific Encyclopedia.
Together they represent no less than 2,988 words. Fig. 3
shows the hierarchy of the terms RADAR and LASER.
Implicit in this attempt to show the relationship between
terms is the intention of creating an awareness of different
types of hierarchy such as those based on method, type,
function, and adjective. The "function
hierarchy" is best shown by the horizontal arrows between
JOINING and WELDING, between SPOT and PROJECTION, and
the double arrows between OXYGEN-HYDROGEN and
ALUMINUM and HELIARC and MAGNESIUM. The "adjective
hierarchy" is the successive modification of a term by
adjectives. As an illustration, let RADAR be the mother
term. Then TRACKING RADAR, RANGE TRACKING RADAR,
RANGE TRACKING FIRE CONTROL RADAR, etc., constitute an
adjective hierarchy. To further clarify the concept of
function hierarchy, let f be the function called FORGING.
Then the process of forging transforms an ingot into a
billet, or graphically:

          METALWORKING
         /            \
     FORGING        CASTING
        |
     INGOT ----> BILLET

From this graph it can be seen that a searcher who was
unfamiliar with this topic could obtain all information
concerning the forging of billets by the intersection
FORGING × BILLET rather than dumping FORGING or
METALWORKING or CASTING.
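
One way to hold such a graph in a machine is as a list of typed links. The sketch below is an assumption-laden illustration, not a description of any existing system: the edge list follows one reading of the small graph above, and the link labels, function names, and postings are invented for the example.

```python
# Edges of a tiny, hypothetical pictorial thesaurus.
# Each edge is (upper or input term, lower or output term, kind of link).
edges = [
    ("METALWORKING", "FORGING", "hierarchy"),
    ("METALWORKING", "CASTING", "hierarchy"),
    ("INGOT", "BILLET", "function:FORGING"),
]


def narrower(term):
    """Terms placed directly below `term` in the hierarchy."""
    return sorted(lo for up, lo, kind in edges
                  if up == term and kind == "hierarchy")


def function_links(function):
    """(input, output) pairs joined by the named function."""
    return [(up, lo) for up, lo, kind in edges
            if kind == "function:" + function]


# Hypothetical postings used to compare an intersection with a dump.
postings = {"FORGING": {1, 2, 3, 4}, "BILLET": {3, 4, 5}}

print(narrower("METALWORKING"))    # ['CASTING', 'FORGING']
print(function_links("FORGING"))   # [('INGOT', 'BILLET')]
print(sorted(postings["FORGING"] & postings["BILLET"]))  # [3, 4]
```

Keeping the kind of link on every edge lets hierarchy, function, synonym, and antonym relations share one structure while remaining distinguishable.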


• The Indexer's Six Infernos


The six infernos into which an indexer is likely to fall 
owe their existence to: ignorance, indecision, word 
frequency, terminology of the abstract, expediency, and 
too much or too little freedom for making terms eligible 
for MTV entry. 

Ignorance is used to mean the indexer's lack of
knowledge of the particular field to which an abstract
relates. It is responsible for practically every indexing
problem and is particularly responsible for the growth
of panacea terms. The obvious solution is to employ
indexers having broad scientific and technical
backgrounds. Another solution to this problem might be to
employ an indexer with broad experience to work closely
with at most three less-experienced personnel. In this way,
any questions regarding the indexing of an abstract could
be answered immediately by the senior indexer, thus
reducing the "ambiguity time" and increasing the
accuracy.

The existence of too many choices of terms under
which a given abstract may be indexed is in part
responsible for indecision on the part of the indexer.
Indecision is the prime cause of overindexing and can be
minimized by discreet selection of MTV terms.

Word frequency is the practice of indexing an abstract
under a term on the basis of its frequency of appearance
in a given abstract. It is responsible for inaccurate and
redundant indexing. While frequency of appearance is
a necessary condition for indexing under a term, it is
not a sufficient condition. An equally important
consideration is the grammatical usage of the term in a given
abstract; i.e., its usage as a noun, adjective, or verb. The
indexer can seek to resolve this problem by taking into
account the grammatical usage that a term implies. For
example, a given abstract concerned with radar ranging
methods may have "range," "ranging," "range radar,"
"to range," "ranging radar," and "radar ranging" in its
title or body. To index this abstract under any term
other than RADAR RANGING METHODS or RADAR RANGING
would produce redundant indexing.

Terminology of the abstract means that whenever the
indexer is confronted with variations in terminology of
essentially similar concepts in different abstracts, he does
not standardize the descriptive terms before proceeding
to index these abstracts under appropriate MTV terms.
This serious oversight is usually due to Ignorance and
Indecision. A term's existence in 10 or 15 or even 50
abstracts should not be a prime criterion for MTV
candidacy. A term's logical inexactitude should be
considered prior to its entry into MTV. A term like
SHORTENING is but one of many terms that must be judiciously
inserted into MTV. For instance, it can mean "cooking
oil," "Doppler shortening of electromagnetic waves," or
even relativistic "shortening of swords" in the sense of
FitzGerald. This problem of terminology will be rectified
when the problems of ignorance and indecision have
been resolved.

Expediency, related to ignorance, means that the
indexer, in an effort to keep up some quota of indexed
abstracts per hour, tends to index certain abstracts in
fields about which he has dubious knowledge under terms
that do not reflect the content of the abstract but which
have been previously established as MTV terms.

The following discussion will illustrate some of the
pitfalls inherent in the six infernos referred to above. The
first concerns the usage of too many terms of the same
family as MTV entries. A representative family that
exists in the NASA system is: CODE, CODER, CODING,
DECODER, DECODING, ENCODER, ENCODING. Since
information has to first be coded before it can be encoded or
decoded, the term CODE emerges as the mother term.

If no hierarchy is defined for this family, then a
literature searcher will, if he neglects to use all of these terms,
lose some abstracts. This incompleteness in searching
could, quite conceivably, mean that the search was
useless. In addition to the loss of possibly pertinent abstracts,
too many terms of the same family in MTV raise both
the input and the output cost. As an illustration of the
effect on output cost, consider the common accessions of
the term CODE with each of the six terms of the family
shown in Table 1.

To obtain the data for the table, single-term dumps
were performed using each term. Next, the common
accessions between each term and the largest term (CODE)
were found manually. The first column of the table shows
the logic code which could have been used to obtain the
common accessions with the computer. The second column
shows the percentage of common accessions of each term
with CODE, relative to the number of accessions of each
term. For example, the first row of the table means that
out of 21 abstracts indexed under CODER, 12 of them were
also indexed under CODE.
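
The percentages in Table 1 follow directly from the two counts in each row. A short sketch of the arithmetic, using the counts given in the table (the dictionary and function names are introduced here for illustration):

```python
# (common accessions with CODE, accessions of the term), as given in Table 1.
table_1 = {
    "CODER": (12, 21),
    "CODING": (12, 57),
    "DECODER": (5, 20),
    "DECODING": (30, 75),
    "ENCODER": (4, 20),
    "ENCODING": (23, 72),
}


def overlap_percent(common, total):
    """Share of a term's accessions that are also indexed under CODE."""
    return round(100 * common / total, 1)


for term, (common, total) in table_1.items():
    print(f"CODE x {term}: {common}/{total} = {overlap_percent(common, total)}%")
```

With the computer, each numerator would be the size of the intersection of CODE with the term, and each denominator the size of the term's own dump.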

Fig. 4 is a schematic representation of all mutually 
common accessions of the six terms. 

In contrast to using too many terms (of a given
family), too few terms will produce the adverse effect of
"forcing the indexing," thereby destroying the uniqueness
of a descriptor. The two extremes conjure up a "law of
spite" which may be partially abolished only by
considering the logical connectivity of terms of a given
family. For the family in question, the four terms CODE,
CODER, DECODER, and ENCODER would amply suffice. Their
hierarchy may be schematized as:


            CODE
              |
            CODER
           /     \
      DECODER   ENCODER
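
Once such a hierarchy is recorded, a search on the mother term can be widened to the whole family automatically, so that the searcher need not remember every member. A minimal sketch, assuming the four-term hierarchy above; the postings and function names are hypothetical.

```python
# Hierarchy proposed above: CODE > CODER > {DECODER, ENCODER}.
children = {
    "CODE": ["CODER"],
    "CODER": ["DECODER", "ENCODER"],
}


def family(term):
    """Return `term` followed by every term below it in the hierarchy."""
    found = [term]
    for child in children.get(term, []):
        found.extend(family(child))
    return found


# Hypothetical postings for each term.
postings = {
    "CODE": {1, 2},
    "CODER": {2, 3},
    "DECODER": {4},
    "ENCODER": {5},
}


def search_family(term):
    """Union of the postings of `term` and all of its descendants."""
    result = set()
    for member in family(term):
        result |= postings.get(member, set())
    return result


print(family("CODE"))                   # ['CODE', 'CODER', 'DECODER', 'ENCODER']
print(sorted(search_family("CODE")))    # [1, 2, 3, 4, 5]
print(sorted(search_family("CODER")))   # [2, 3, 4, 5]
```

The union plays the role of the "logical sum" on the output side, but the list of terms to be summed now comes from the hierarchy rather than from the searcher's memory.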


The second concerns the lack of rules for indexing under
one or more terms having the same grammatical root.
It is felt that distinctions should be made regarding the
usage of -ing, -or, -er, -ion terms in MTV. Failure to
distinguish between terms having these endings will result
in an abstract being indexed haphazardly under some or
all of the terms having the same grammatical root.

TABLE 1.

Logic code         Common accessions / accessions of term
CODE × CODER       12/21 = 57.1%
CODE × CODING      12/57 = 21.1%
CODE × DECODER      5/20 = 25%
CODE × DECODING    30/75 = 40%
CODE × ENCODER      4/20 = 20%
CODE × ENCODING    23/72 = 32%


[Figure 4: overlapping sets for the family of CODE terms, with the
common accession numbers written in each overlap; labels CODING and
ENCODER are recoverable.]

Fig. 4. A family of MTV terms and their common
accession numbers. ("A" numbers refer to articles whose abstracts
are found in the International Aerospace Abstracts [IAA];
"N" numbers refer to articles whose abstracts are found in
the Scientific and Technical Aerospace Reports [STAR].)


Some rules may be evolved by considering the premise
that any thought, idea, or concept can be expressed in
verb form. Consider the four-tuple: DETECT, DETECTING,
DETECTOR, DETECTION. The term DETECTING represents the
verb form and hence the idea or concept, DETECTOR
being the necessary "tool" (machine, device . . .) for
implementing the idea, and DETECTION representing the
terminus of the chain. In symbolic logic form we have:

    DETECT => DETECTING => DETECTOR => DETECTION

where the grammatical suffixes and the implication signs
determine the genealogy.
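
The suffix genealogy can be applied mechanically to a family of terms that share a root. The sketch below is illustrative only: the numeric ranks are an assumption chosen to mirror the chain described above (verb form, then tool, then terminus), and plain suffix matching will not handle roots that change spelling, such as CODE and CODING.

```python
# Rank of each suffix in the chain described above; the ranks themselves
# are an assumption made for this sketch.
SUFFIX_RANK = {"": 0, "ING": 1, "OR": 2, "ER": 2, "ION": 3}


def genealogy(root, terms):
    """Order the terms that share `root` by the rank of their suffix."""
    ranked = []
    for term in terms:
        suffix = term[len(root):]
        if term.startswith(root) and suffix in SUFFIX_RANK:
            ranked.append((SUFFIX_RANK[suffix], term))
    return [term for _, term in sorted(ranked)]


print(genealogy("DETECT", ["DETECTION", "DETECTOR", "DETECT", "DETECTING"]))
# ['DETECT', 'DETECTING', 'DETECTOR', 'DETECTION']
```

Terms that do not share the root, or whose ending is not one of the listed suffixes, are simply left out of the ordering.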


• Negations and Antonyms 


The NASA system allows searches to be formulated by
using negation or complementation of terms; i.e., given
any two MTV terms A and B, the information bank may
be searched by either A × Bᶜ or B × Aᶜ. Complementation
may be used to advantage whenever it is known a priori
what terms are to be negated. For example, suppose a
searcher required information on the subject of noise
excluding radio noise, vibration noise, and stellar noise. He
could formulate his search request by A × (B + C + D)ᶜ or
A × Bᶜ × Cᶜ × Dᶜ, where A, B, C, and D stand for NOISE,
RADIO, VIBRATION, and STELLAR, respectively.
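
The two formulations of the noise request are equivalent by De Morgan's law, which a small example can confirm. The postings and the ten-document information bank below are hypothetical.

```python
# Hypothetical postings; `universe` stands for the whole information bank.
universe = set(range(1, 11))
noise = {1, 2, 3, 4, 5, 6}
radio = {1, 2, 9}
vibration = {3, 10}
stellar = {4}


def complement(s):
    """Documents in the information bank that are not in `s`."""
    return universe - s


# A x (B + C + D)^c
form_1 = noise & complement(radio | vibration | stellar)

# A x B^c x C^c x D^c
form_2 = noise & complement(radio) & complement(vibration) & complement(stellar)

print(sorted(form_1))    # [5, 6]
print(form_1 == form_2)  # True
```

The first form needs a single complementation, which may matter when each complement is taken against a large information bank.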

The computer's ability to process negation of terms is
essential to the overall flexibility and conciseness of the
system. This ability, however, can be endowed only by the
indexer. Negation of terms in English can be accomplished
by prefixes such as il, im, in, not, non, and the suffix less.
With the exception of those terms which can only be
negated by the suffix less, many terms can be negated by
more than one prefix. Negation by one or more of the
triples (un, not, non; in, not, non) is an example.
The absence of a uniform negation rule and the lack of
correlation between synonyms, antonyms, and negations
of MTV terms are a source of difficulty in working with
the NASA system. In essence, the problem is that there
exists no way by which a searcher, having found term "X"
in MTV, can quickly and economically find all its
synonyms or antonyms that appear in MTV. Should the
searcher consult any number of thesauri for the synonyms
or antonyms of a term "X," he has no guarantee that
the author, abstracter, and indexer used the same sources.
In cases where the assumption has been made that the
same sources (e.g., EJC Thesaurus, Van Nostrand's
Scientific Encyclopedia, etc.) had been used, the search
efforts proved inadequate. In addition, the trend toward
single, double, and triple modification of terms can make
the problem less tractable. For example, consider the
terms NOISE ELIMINATOR, WHITE NOISE ELIMINATOR, and
RANDOM NOISE REDUCTION. If the searcher were interested
in the subject of noise reduction and this term did not
appear per se in MTV, there would be no way of finding
the above three terms because of the adjective
camouflage. The only alternative would be to dump the term
NOISE and screen 2,500 abstracts. This is a very expensive
practice. The point is that indexing on different generic
levels without a key to these levels is no better than
indexing under a single level. A method which may
overcome this difficulty would involve picking a term,
creating its hierarchy, and making a list of all its synonyms
and antonyms as they appear in MTV. This task is not
as formidable as it might seem because, with the possible
exception of panacea terms, these lists would be quite
small.

Since every abstract reflects its author's or abstracter's
preference of term negation, the indexer is forced, due
to lack of a "negation rule," to follow suit. The result
of this conforming will be the existence of families of MTV
terms such as: UNSTABLE, INSTABLE, INSTABILITY,
NONSTABLE, NONSTABILITY, NONSTABILIZED, and
INFLAMMABLE, NONFLAMMABLE, and NONINFLAMMABLE. An
extremely simple way of ridding MTV of multiple-negation
forms is to employ a rule for grammatical negation. Such
a rule might read: If any term is important enough to be
inserted into MTV, and this term appears in any of the
forms il, im, in, not, non, its negation is defined as non
plus the term. The case of less shall not be affected by
this rule. Examples of this rule's duality are flammable,
nonflammable; stable, nonstable; metal, nonmetal, etc.
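
A rule of this kind is simple enough to sketch in code. The version below is one possible reading, with assumptions stated plainly: it also treats "un" as a negating prefix because UNSTABLE is listed among the variant forms, and it rewrites a term only when the remainder is a known base term, so that words such as INDEX are not mangled. The base-term list is hypothetical.

```python
# Negating prefixes to normalize. "UN" is included because the text lists
# UNSTABLE among the variant forms; that inclusion is an assumption here.
PREFIXES = ("NON", "NOT", "UN", "IL", "IM", "IN")


def normalize_negation(term, base_terms):
    """Rewrite a negated term as NON + base when the base is a known term."""
    for prefix in PREFIXES:
        base = term[len(prefix):]
        if term.startswith(prefix) and base in base_terms:
            return "NON" + base
    return term  # not a recognized negation; leave unchanged


base_terms = {"STABLE", "FLAMMABLE", "METAL"}  # hypothetical MTV base terms

for term in ["UNSTABLE", "INSTABLE", "NONSTABLE", "INFLAMMABLE", "INDEX", "NOISELESS"]:
    print(term, "->", normalize_negation(term, base_terms))
```

Terms negated by the suffix less pass through untouched, in keeping with the exception stated in the rule.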


At this point the two problems associated with
synonyms will be discussed in terms of hierarchy,
antonyms, and negations. The first problem concerns the
possible appearance of one or more abstracts in the search
yields of two or more synonymous terms whenever each
term is placed on a separate search sheet. The second
problem concerns the loss of those abstracts that were
not indexed under both synonyms A and B. Essentially,
both problems have a common solution. First, a hierarchy
should be created where A, B, and all their synonyms are
connected. Next, the synonyms should be ordered by
"set inclusion"; i.e., if A, B, and C are three synonyms
of graduated descriptive power, where A is the least
descriptive and C the most descriptive, they will be
ordered in such a way that C will be a subset of B and B
a subset of A. Under this plan, the chance of losing any
abstracts would be reduced by 50% since losses could
only occur if the search were conducted with Term B.

24 American Documentation, January 1966

Further, panacea terms such as NOISE should be considered
the least descriptive in a family of terms which
includes all adjective modifications as well as synonyms.
Consequently all negations and antonyms should be considered
relative to the term NONNOISE. Hence the terms
QUIET, SILENT, SILENCE, NOISELESS, etc., shall be considered
subsets of NONNOISE. This rule will facilitate
searches involving the use of complementation.
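
The ordering by set inclusion can be sketched as follows.
This is a minimal illustration rather than the author's
procedure: the BROADER table, the function names, and the
flat arrangement of terms beneath NONNOISE are assumptions.
Each abstract is posted under its index term and under every
less descriptive term above it, so that the postings of a
more descriptive term are always a subset of the postings of
the term that contains it.

```python
from collections import defaultdict

# Each term points to the less descriptive term that contains it.
# NONNOISE as the top of the family follows the text; the flat layout
# beneath it is an assumption made for this sketch.
BROADER = {
    "QUIET": "NONNOISE",
    "SILENT": "NONNOISE",
    "SILENCE": "NONNOISE",
    "NOISELESS": "NONNOISE",
}


def ancestors(term):
    """Return the term followed by every less descriptive term above it."""
    chain = [term]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return chain


def build_postings(indexed):
    """Post each abstract under its terms and under all their ancestors."""
    postings = defaultdict(set)
    for abstract_id, terms in indexed.items():
        for term in terms:
            for t in ancestors(term):
                postings[t].add(abstract_id)
    return postings


postings = build_postings({1: ["QUIET"], 2: ["SILENT"], 3: ["NONNOISE"]})
```

A search on NONNOISE then yields abstracts 1, 2, and 3, each
exactly once, while a search on QUIET yields only abstract 1.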


• Acronyms


The growth rate of acronyms in technical literature
invokes the postulation of rules for indexing abstracts
containing them.

The second edition of the Space Age Dictionary (SAD)
contains approximately 140 acronyms and at least 200
symbols. It is estimated that 95% of the SAD entries
appear in the technical literature contained in the NASA
system. The majority of the symbols represent names of
organizations or research projects rather than technical
terms. Acronyms, however, do represent and are capable
of contributing many technical terms to MTV. For instance,
the acronyms RADAR, SONAR, LASER,
MASER, and DORAN represent a set of at least 20
single technical terms. The panacea terms are denoted
by an asterisk: RADIO,* DETECTOR, DETECTION,* DETECTING,
RANGE,* RANGING, SOUND,* SOUNDER, SOUNDING,
NAVIGATION,* NAVIGATING, LIGHT,* MICROWAVE,* AMPLIFIER,*
AMPLIFICATION, STIMULATED, STIMULATION, EMISSION,*
RADIATION,* and DOPPLER.

In addition, combinations of these terms may be
formed, thus swelling the vocabulary without approaching
in any way the descriptive power of the appropriate
acronym. Examples of some combinations are RADIO
RANGING, SOUND NAVIGATION, LIGHT RANGING, LIGHT
AMPLIFIER, STIMULATED EMISSION, etc. It is obvious that,
at this rate of combining and modifying terms, it will take
upwards of 300 redundant or semiredundant terms to
express what five acronyms can accomplish more concisely.

Without exception, acronyms are used as nouns or as
adjectives. This fact leads to four possible cases for their
appearance or for the appearance of their constituent
terms in any abstract.


1. The acronym, modified by or modifying another
term, appears in the title or in the body of the abstract.

2. The acronym, modified by or modifying another
term, appears only in the body of the abstract.

3. The acronym, modified by or modifying or coexisting
with some or all of its constituent terms, appears
in the title or in the body of the abstract.

4. Some or all of the terms which form a particular
acronym which does not appear in either the title or
in the body of the abstract, appear in the title or in
the body.


With regard to these four cases, the following indexing
rules are given:


1. If, in a given abstract, the acronym is used as a
noun, the abstract should be indexed only under
the defining acronym. If no such acronym appears
in the machine term vocabulary, it should be made
to appear.

2. Under the premises set forth in Cases 1, 2, 3, or 4, or
any combination of these, the abstract should always
be indexed under the defining acronym. In
certain cases the abstract may be indexed under at
most one term other than the acronym involved,
provided that the other term reflects the central
theme or main idea of the entire abstract and not
simply the title.

3. If the acronym in either the title or the body of
an abstract is modified by another acronym or by a
term which itself describes either a method or a
concept, then the abstract should be indexed under
both acronyms or under the “noun-acronym” and
under a single additional term best describing the
method or concept involved. An example of one
acronym modifying another is LASER RADAR.
In this case, the single additional term could be
COLIDAR. An example of a method or concept
“acronym modifier” is DOPPLER RADAR.

4. An abstract containing an acronym must not be
indexed under any or all of the constituent terms
or their synonyms. For example, this rule forbids
the indexing of an abstract containing the acronym
RADAR under at least the following terms: RADIO,
DETECTOR, DETECTION, RANGE, RANGING, all synonyms
of these terms and all combinations of these
terms.

5. If “acronym” is replaced by “symbol,” the above
rules still apply.
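
As a rough illustration of how rules 2 and 4 might be applied
mechanically, the following Python sketch filters a list of
candidate index terms. It is a simplification and not part of
the original proposal: the CONSTITUENTS table, the function
name, and the assumption that candidate terms arrive ordered
by how well they reflect the central theme are all
hypothetical. Synonyms of constituent terms are not handled.

```python
# Constituent terms per acronym. The RADAR entry lists the terms named in
# rule 4; the table itself is a hypothetical structure for this sketch.
CONSTITUENTS = {
    "RADAR": {"RADIO", "DETECTOR", "DETECTION", "RANGE", "RANGING"},
}


def index_terms(acronyms_found, candidate_terms):
    """Keep the acronyms, drop their constituents, allow one further term.

    candidate_terms is assumed to be ordered from most to least central
    to the abstract, so only the first surviving term is kept (rule 2).
    """
    forbidden = set()
    for acronym in acronyms_found:
        forbidden |= CONSTITUENTS.get(acronym, set())

    extras = []
    for term in candidate_terms:
        words = term.split()
        # Rule 4: a constituent term, or a combination made up entirely
        # of constituent terms (e.g. RADIO RANGING), is not admitted.
        if all(word in forbidden for word in words):
            continue
        extras.append(term)

    return list(acronyms_found) + extras[:1]


result = index_terms(
    ["RADAR"], ["RADIO RANGING", "DETECTION", "TRACKING", "ANTENNA"]
)
```

Here the abstract is indexed under RADAR and TRACKING only;
RADIO RANGING and DETECTION are rejected as constituents, and
ANTENNA is dropped because one additional term has already
been admitted.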




• Conclusion 


This paper has demonstrated some of the more common
indexing problems associated with any information
system using a machine term vocabulary. Suggested
methods for improving the system and providing feasible
solutions to the stated problems have also been advanced.
Though primitive, the partial solutions offered
in this paper can, upon further research, be refined and
hence contribute toward the improvement of information
systems.


The User's Place in an Information System¹


• Introduction


In our paper at the Airlie House Symposium (1) my 
colleague, Professor Paisley, and I characterized informa- 
tion-retrieval systems as receiver-controlled communica- 
tion systems. We argued that information systems should 
be designed to maximize the amount of control by the 
receiver, and that, in general, the system should adapt to 
the receiver or user, rather than the user to the system. 

We argued that the information needs and the in- 
formation-seeking and information-processing behavior 
of scientists should be the subject of considerable psycho- 
logical research. Such psychological research is required 
to provide adequate specifications for system designers 
and adequate criteria for the evaluation of systems.


• Future Systems


If we ignore all the constraints imposed by the limitations
of existing technology, we can postulate a future
information system with deep and flexible indexing,
tailored to specific search requests. It might permit full
text computer scanning of large numbers of documents
according to criteria set by the user for that particular
search. Let me illustrate the need for such flexible systems
with a borrowed example (2). There are scientists
who have the habit of piling up books and papers on their
desks in a seemingly random fashion, yet know all the
time how to find any given item. Should an assistant or a
secretary bring apparent order to the desk, then the
poor scientist may be unable to find anything. What is
order to one person may be disorder to another, and
vice versa.

The business of science itself can be said to be the
creation or discovery of novel ways of viewing the en- 
vironment such that hitherto unobserved forms of order 
come to light. This being so, it is not surprising that the 
categories used by a scientist in his current research 
often do not coincide with the categories of an inflexible 


1 Presented at the ADI Symposium on Education for Information Science,
Washington, D. C., October 10, 1965.
2 Associate Professor of Communication, Stanford University.
index. Perhaps in some utopian future, all of us who
consider ourselves scientists will be able to query the
information system from computer consoles in our offices,
according to indexing schemes prepared to fit the associa- 
tion patterns in our minds at the time of the query. 
Development of such a system would require as much re- 
search into the thought processes of scientists as it 
would research in systems design. 

In the nearer future we will undoubtedly have to be 
content with indexing schemes prepared in advance of 
specific queries. Such relatively fixed indexes and clas- 
sification schemes should, to be most useful, be based on 
prior research into the category systems and associa- 
tion patterns implicit in the minds of the potential users. 
Such research should investigate not only the categories
and associations common to most of the potential user
group, but also the range of idiosyncrasy in user
information-needs. The deviant thinker may well be the more
creative scientist.


• Three Caveats


There are three caveats that must be entered against 
this view of information retrieval as a receiver-controlled 
system and the concomitant suggestion that systems 
should adapt to users rather than users to systems. 

One is the minor point that any information system 
can provide only what is available in the system, and to 
that extent is source controlled. 

The second is that probably not all idiosyncratic search 
behavior should be accommodated. We may have to ac- 
cept that a busy senior scientist with somewhat fixed pat- 
terns of information seeking may not be able or willing to 
change those habits. But we may want to teach younger 
or more flexible scientists more efficient search techniques. 

The third caveat is that scientists do adapt to new 
communication systems. Just as the introduction of auto- 
mobiles in our societies has profoundly influenced our 
social behavior, so new information technology will con- 
tinue to influence our communication behavior. For ex- 
ample, it is usually clear at meetings such as this that 
jet air travel has considerably changed our use of both 


informal and formal interpersonal channels for exchanging
scientific information.


• User in System


Given these caveats it might be well to consider an
alternate formulation. Another way of viewing the
information-retrieval problem is to view it as a larger system
in which the users are seen as part of the total
information exchange system instead of outside it. In this
larger system the problem becomes one of adjusting
appropriate subsystems such that the flow of information
to the scientist from all sources is in some way optimized.

One defect in this broader formulation is that it is
extremely difficult, if not impossible, to specify criteria for
the evaluation of such systems. Perhaps it might be
possible to adequately specify “performance requirements”
of engineers or scientists working on well-defined
tasks. In such cases it should in principle be possible to
evaluate the information flow within the system according
to how well the engineer or scientist meets those
performance requirements. I am less optimistic about
such criteria for basic scientific research, however. Can
we adequately measure scientific productivity? Scientists
engaged in basic research implicitly set their own
“performance requirements” by their behavior. This implicit
setting of criteria within the system makes it difficult to
visualize adequate explicit, external criteria for the
evaluation of such systems.

Nevertheless, regardless of whether the user is viewed
as inside or outside the information system, one conclusion
is clear. That is the need for considerably more
psychological research dealing with human factors in
information-retrieval systems. There are several subareas
of psychological research that seem particularly relevant.
One is the physiological-perceptual research area that
James G. Miller pointed at in his comments at the Airlie
Symposium when he outlined the kinds of responses a
human makes when confronted with an overload of
information inputs. Another is the tradition of human factors
research in man-machine systems. Another is the
social psychological study of user needs and use of informal
and formal communication channels. Still another
is the area of cognitive theory and verbal behavior, which
is highly relevant to the problem of developing classification
schemes and association patterns that fit those of
the users.


• Education


Those institutions engaged in teaching or research in
information science that have not already done so might
well consider hiring psychologists interested in information
problems to supplement their present faculties and
staffs. And since the demand for behavioral-science-trained
information scientists is likely to exceed the
supply for some time to come, some institutions might
well consider introducing a PhD level program of education
in this specialty.

It should be clear that information science can benefit 
from the detailed study of the most versatile information 
processing system ever developed or discovered — the 
human organism. 


A Clearinghouse for Scientific and Technical Meetings:
Organizational and Operational Problems


There has been much interest expressed in the formation
of a clearinghouse for scientific and technical
meetings. In spite of this interest, no clearinghouse
has been formed. Whether or not there is a valid
need for the existence of such a clearinghouse has not
been proven, but in this paper it is assumed that the
need exists. Discussion of the organizational and
operational problems involved follows. These problems
include: (a) definition of area and level of
coverage of the subject matter of any given meeting;
(b) definition of the geographic area from which the
meeting will draw its attendance; (c) definition of
the area of interest of the clearinghouse in the face
of difficulties imposed by mission- or project-oriented
meetings and other interdisciplinary meetings; (d)
attaining comprehensive coverage in spite of the
difficulties in obtaining inputs from the organizers of
government classified and other closed meetings, as well
as from the organizers of ad hoc meetings; and (e)
the need to ensure that the organizers of meetings will
make use of the clearinghouse. Solutions to these
problems will not yield to a simple approach, but will
be obtainable only on the basis of careful study of the
structure and dynamics of the national and international
scientific and technical meetings network.


HARRY BAUM


Technical Meetings Information Service
New Hartford, New York


The first complaint about meetings was almost undoubtedly
made by Ptolemy at the time that Alexandria
was a center of learning. It was at Alexandria that science
began to develop along the line of “specialties” (1). It
is, of course, specialization that has caused the information
explosion of which the “meetings problem” is but one
manifestation.

The most common complaint about meetings is that
there are too many of them and that there is too much
duplication and overlap. The usual cure offered is that
a clearinghouse be set up that will enable the organizers
of meetings to determine whether or not a similar meeting
was already planned for a similar audience at a
similar time (2, 3, 4, 5). In theory, a potential meeting
sponsor, finding that such a conflict existed, would either
call off his projected meeting or combine it with the
conflicting meeting. While the idea is a valid one and is
conceptually simple, it is by no means simple from the
standpoint of organization and operation. It is these
factors that I will discuss at some length. My discussion
will be based on the experience I have gained in three
years of planning and directing Technical Meetings
Information Service (TMIS). I originally envisioned TMIS
as a clearinghouse. It fell far short of that goal. The
reasons will be apparent in the succeeding discussion.

Before getting on with the main discussion, however, I
must clarify my position with regard to the complaints
that exist about proliferation and overlap of meetings.

I neither agree nor disagree with these complaints. I
believe that they are based, for the most part, on intuition
rather than fact. A fair amount of contrary evidence
does exist. For example, in the biomedical field a recent
study by Orr and others (6) concluded that: “The large
increase, over the past few decades, in the number of
meetings at which biomedical research is reported has
not exceeded the increase in the number of scientists
engaged in such research and is a direct consequence of
this growth in manpower . . . .” Similarly, in the field
of psychology, a report by Compton (7) states: “Since
membership [in the period from 1936 to 1961] has
increased ninefold, an increase in programmed events
could be anticipated, though the latter has, in fact, not
kept pace with the gain in membership.” A study by
Pendray (2) of technical meetings in the flight sciences
concludes: “After careful reading of the programs of
eight major societies in the flight sciences, and detailed
comparative study of the programs of four of these
societies . . . we have come to the conclusion that there is
relatively little, if any, overlapping of specific subject
matter in the meetings of these societies.”

I cite these contrary opinions, not to undermine the
case for a clearinghouse, but rather to put what is often
considered the primary reason for the existence of a
clearinghouse in perspective. My own belief is that the
need for a clearinghouse will become increasingly urgent
in response to several clearly observable trends within
the scientific and technical community. First, there is the
growing interest in the applicability of the methods and
results of other disciplines to the problems within a given
discipline. Second, there is increasing emphasis on areas
such as environmental science and engineering, and aerospace
science and engineering that cut across almost the
complete spectrum of science and engineering. Third,
there is the rapid expansion of engineering and science
into the public sector. I believe that these factors are
causing a shift in the social structure of the entire
scientific/technical community. As a result of that shift,
the neat compartmentalization of science and technology
that had formerly permitted us to keep track of any
given discipline is being destroyed. The problem of keeping
track of the movement of science and technology is
increasing by an order of magnitude as a result. While
there may have been little conflict among meetings in
the past, this nice state of affairs cannot be expected to
continue unless a mechanism is provided that can cope
with the increasing order of complexity. The clearinghouse
is such a mechanism.

But prevention of conflict among meetings should
certainly not be considered the only function of the
clearinghouse. A great amount of information is exchanged
via meetings. They serve both as a formal and an informal
medium for scientific and technical intercourse. The cost
to the scientific and technical community of meetings
is in the neighborhood of $1 billion per year. And, in
terms of the sociodynamics of science and technology,
a recent report on information use among scientists and
engineers reveals that we may have seriously underestimated
the importance of oral communication (8). A
clearinghouse, in addition to serving to prevent conflict,
could also serve as a central source of information about
meetings for the use of the entire scientific/technical
community. Perhaps most important, it could be the key
information-gathering arm of a much-needed organization
designed to investigate the sociodynamics of meetings
with a view toward enhancing their efficiency as a component
of the scientific and technical information exchange
complex.

Having finished with my philosophical discursion, I will
get down to the main purpose of this paper: the discussion
of the organizational and operational difficulties involved
in a clearinghouse. These difficulties include: (a)
definition of area and level of coverage of the subject
matter of any given meeting; (b) definition of the geographic
area from which the meeting will draw its
attendance; (c) definition of the area of interest of the
clearinghouse in the face of difficulties imposed by
mission- or project-oriented meetings and other
interdisciplinary meetings; (d) attaining comprehensive
coverage in spite of the difficulties in obtaining inputs
from the organizers of government classified and other
closed meetings, as well as from the organizers of ad hoc
meetings; and (e) the need to ensure that the organizers
of meetings will make use of the clearinghouse.


• Definition of Area and Level of Coverage


The first requirement for a clearinghouse is some 
method of classifying the technical content of the meet- 
ings. The difficulty of this task depends on the use for 
which the clearinghouse is intended. If it is intended 
merely as a conflict-resolution mechanism, then a simple 
coarse-grained subject-classification system similar to that 
recently adopted by the Committee on Scientific and 
Technical Information (COSATI) (9) would probably 
suffice. Such a classification, coordinated with information 
on date and geographic location, would serve as a coarse
filter to screen out meetings that might be in conflict. 
Detailed examination would then indicate whether or not 
a true conflict does exist. 
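
The coarse filter described above might be sketched as
follows. This is an illustration only: the Meeting record,
the function name, the 30-day window, and the requirement
that the geographic designations match exactly are all
assumptions chosen for the example, not features of any
existing clearinghouse. The filter merely nominates meetings
for the detailed examination that would follow.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Meeting:
    title: str
    category: str  # coarse subject category (hypothetical labels)
    start: date
    region: str    # geographic designation (hypothetical labels)


def possible_conflicts(new, planned, window_days=30):
    """Return planned meetings sharing category and region with the new
    meeting and starting within window_days of it."""
    return [
        m for m in planned
        if m.category == new.category
        and m.region == new.region
        and abs((m.start - new.start).days) <= window_days
    ]


proposed = Meeting("Radar Symposium", "electronics", date(1966, 5, 10), "national")
planned = [
    Meeting("Weather Radar Conference", "electronics", date(1966, 5, 25), "national"),
    Meeting("Soil Chemistry Meeting", "chemistry", date(1966, 5, 12), "national"),
]
candidates = possible_conflicts(proposed, planned)
```

Only the Weather Radar Conference is passed on for detailed
examination; narrowing the window to seven days would pass
nothing.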

On the other hand, if the clearinghouse is to be used
as a source of information on meetings for the use of 
the scientific and technical community as a whole, then a 
system capable of much more sophisticated discrimina- 
tion is required. A system of subject indexing using tens 
of thousands of terms would probably be needed. Some of 
the broader meetings would probably require hundreds 
of terms to adequately describe their content. In addition 
to categorizing meetings by subject, it would probably 
be necessary to classify them by level of treatment of 
the subject. This type of classification has been discussed 
in some detail by Savage (3). 

If the latter type of classification is to be used, the 
clearinghouse must have available to it, depending upon 
the degree of specialization of the meeting, anything 
ranging from merely the title of the meeting (if it is 
extremely specialized such as “Conference on the Conflicts 
concerning Two-Gas Atmospheres and Artificial Gravity 
for Space Flight”) to the title of every paper to be 
presented and some indication of its degree of technical 
sophistication (for a very broad meeting, such as the 
Annual Meeting of the American Association for the Ad- 
vancement of Science). Such a system would make pos- 
sible a sophisticated information-retrieval system, and 
probably more important, a truly workable means for 
alerting the community to meetings of interest. 


• Definition of the Geographic Area From Which
a Meeting Will Draw Its Attendance


Meetings are geographically categorized as local, regional,
national, or international. While the meaning of
each of these words, per se, is reasonably unambiguous,
they are not sufficiently precise to permit their use in a
clearinghouse intended for conflict resolution. A meeting
designated by the sponsor as “international” may draw
only a handful of attendants from outside the country
of origin. Societies holding a “national” meeting each year
find that the number of attendants is strongly dependent
on the location of the meeting. Some meetings designated
as regional may draw attendants from all over the nation.
(One such meeting is the “Pittsburgh Conference on Analytical
Chemistry and Applied Spectroscopy.”)
What is really needed to enable a clearinghouse to function
as a conflict resolver is a description of the expected
distribution of attendance. Obtaining such a statement is
much more difficult than obtaining a simple statement of
the overall area from which attendance will be drawn.
I feel, however, that information on attendance distribution
is essential to the proper functioning of the
clearinghouse.


• Defining the Area of Interest of the Clearinghouse


Any workable clearinghouse must have a limited
franchise. Certainly any attempt to cover all meetings,
on all subjects, anywhere, would be more than unwieldy.
My own guess of the number of meetings that would be
involved by such an area of interest would place it in
the order of hundreds of thousands, perhaps even millions,
per year. Limiting the franchise to regional, national, and
international meetings would probably reduce the number
by one or two orders of magnitude, to the tens of
thousands. Further limiting the franchise to scientific and
technical meetings would reduce the number to the high
thousands or the low tens of thousands. Even this number,
however, might prove unwieldy if information in
depth were to be made available. It is at this point,
unfortunately, that further restriction of the franchise
becomes difficult. The difficulty arises from the fact that
the clearinghouse must deal with complete meetings, and
that many meetings cut a very broad swath through
the combined fields of science and technology. This problem
is comparable to that which would exist if Chemical
Abstracts or Biological Abstracts were required to deal
with complete journals rather than with individual
papers.


Perhaps an example will help make my point a bit
clearer. Let us consider a discipline-oriented franchise:
electronics. Electronics would have to include radio and
radar astronomy; biomedical engineering and instrumentation;
geosciences instrumentation; psychology embodied
in human factors and display engineering; agriculture,
forestry, and photogrammetry as involved in
remote sensing of environment; meteorology as involved
in weather radar; nuclear engineering and physics as involved
in power generation; statistics of quality control
and failure analysis; space sciences as involved in computers,
navigational instruments, rendezvous control,
electric propulsion, etc. I could, of course, expand the list,
but it already encompasses, in addition to other branches
of engineering, disciplines in the physical sciences, geological
sciences, biological sciences, and even the behavioral
sciences. That should illustrate my point well
enough.

Choosing a mission-oriented franchise would complicate
the problem at least as much. Consider, for example,
the problem of choice that any of the following “missions”
would entail: (a) environmental science and engineering,
(b) communications and information, (c) space science
and engineering.

My own tendency in deciding what to encompass in
the TMIS Technical Meetings Index has been to admit
all of science and technology. For a publication such as
the Index, where comprehensiveness of coverage is a goal
rather than a necessity, the choice has, so far, been
practical. For a clearinghouse designed for conflict reduction,
however, comprehensiveness within the area
of the franchise becomes virtually a necessity, and the
choice of that franchise requires careful analysis.


• Problems in Obtaining Inputs


The problem of obtaining comprehensive information
on future meetings resolves itself into two phases. The
first is learning of the existence of a meeting; the second
is obtaining detailed information on the meeting after
its existence is known.

The simplest part of the first problem is learning of the 
regularly recurring meetings of the established scientific 
and technical societies. One can rely on their being held 
every year at about the same time. The time and place 
of occurrence are usually set two or more years in ad- 
vance. Even this problem, however, is complicated by the 
fairly rapid emergence of new societies. 

Many societies, in addition to their regular meetings,
will call a number of special, ad hoc, meetings during
the year. These meetings are usually smaller than the
“regular” meetings, and are organized on a shorter
schedule, usually in the range of six months to one and
a half years. Because of their irregularity, coupled with
the shorter organizational time scale, these meetings are
much more difficult to learn of than the regular meetings.

Of a still higher order of difficulty of acquisition are
the meetings that are organized outside the framework
of the professional societies. Organizers of such meetings
include government, educational institutions, laboratories,
trade organizations, and industry. Some are regular meetings;
others are ad hoc. Of the latter group, one can only
know that such meetings will occur. It would require an
extremely good intelligence system to enable their rapid
acquisition.

Perhaps the most difficult meetings to learn of are the
“classified” meetings held under the auspices of the armed
forces. While the existence of these meetings, and their
names, is not classified information, the armed forces are
often reluctant to make their existence public knowledge
for fear that attempts by unauthorized persons to attend
will add to the administrative problems of the meetings.
These meetings are considered to be extremely important
ones in the industrial/military community and should,
I feel, be included in any clearinghouse.

Learning of the existence of a meeting is not sufficient.
One must also obtain information in detail. This implies
that active cooperation of the organizers of meetings
must be obtained. My experience has been that most
organizers of unclassified meetings will willingly give
information. Unfortunately, there are a handful of important
organizations that are not eager to cooperate.
Their general attitude can probably be summed up by
the following paraphrases of correspondence and conversations
I have had.


We keep our members informed of our meetings by
means of notices in our journal, and we aren't particularly
interested in people who aren't members.


Meetings are open only to members of the Society and
their guests. There is no general solicitation of papers.
I doubt that making information on our meetings
public would serve a useful purpose, and it might
have the awkward effect of leading all and sundry to
assume that they can present papers at our meetings.


I’m afraid it would take a great deal of paper and
reddish-purple ink to adequately convey my feelings
about this sort of presumptuous snobbery. My real
complaint, feelings aside, is that this attitude is
irresponsible. It fails to acknowledge the need for information
of the rest of the scientific and technical community,
to say nothing of the need for information, and the
right to it, of the general public.

The other problem area in regard to obtaining detailed
information is the armed-forces supported classified meetings.
This is not to say that the information is classified.
Information about the meeting, including programs and
abstracts, is usually unclassified. Many of the organizers
are willing to give information, but many are not. Again,
the reason given for withholding details is that public
disclosure of information might cause unauthorized
people to attempt to attend or to obtain copies of the
papers. While these reasons do have a certain amount of
validity, they are also open to criticism on the grounds
that they make information less available to workers in
the field who should have access to it.

'Three approaches may be taken by а clearinghouse 
to the overall problem of obtaining information on meet- 
ings. The'first is that of passive detection. In this method, 
the clearinghouse examines existing published literature 
for information on future meetings. This method involves 
a great deal of effort and has the twin liabilities of the 
use of secondary sources of information: lack of timeliness 
and greater risk of inaccuracy than in the use of primary 
sources. The second method is that of active detection. 




Here, possible sponsors of meetings are canvassed on a 
regular basis. This method results in greater accuracy 
and may, depending on the frequency of the canvass, re- 
sult in greater timeliness. Its disadvantage is that it is 
completely dependent on the clearinghouse’s knowledge 
of who may be expected to sponsor meetings, and on the 
willingness of the sponsor to cooperate with the clear- 
inghouse. 

The third method involves the application of pressure 
in some manner so that the sponsor will consider it 
necessary to automatically provide the clearinghouse with 
the needed information. 

For a clearinghouse that is not intended as a conflict- 
resolution mechanism, some combination of the first two 
methods would probably suffice. For a conflict-resolving 
mechanism, however, the third method would almost un- 
doubtedly be required. 


• The Need to Ensure that Organizers of Meetings
Will Use the Clearinghouse


Any clearinghouse, to function as a successful conflict-reduction
mechanism, must have the active cooperation
of the organizations among whom it is to reduce the
conflict. While the utility of a clearinghouse can probably
be shown on the basis of the overall “energy budget” of
the scientific and technical information-transfer system, 
it is not a simple matter to convince any single sponsor 
of meetings that it is in his own best interest to surrender 
a portion of his autonomy to a regulative organiza- 
tion. And a properly functioning clearinghouse, even if 
not explicitly organized as a regulative body, would 
inescapably come to have that function implicitly. Most 
sponsors of meetings will probably agree that a clearing- 
house would be a good idea; but as far as they are con- 
cerned, they know about the plans of other organiza- 
tions of interest (or other organizations know of their 
plans) and they really don’t need it. Whether or not such 
an attitude is justified, is certainly not known at this time. 
(As I noted earlier, my own guess is that it is justified 
now, but won’t be for long.) Study will be required to 
determine whether or not this type of clearinghouse is 
needed. But if it is, then some method will have to be 
found to encourage active cooperation with the clearing- 
house. One possibility for obtaining this cooperation 
would be in organizing the clearinghouse as part of an 
auditing agency, somewhat like the Audit Bureau of 
Circulation in the publishing industry. I have discussed 
such a proposal at some length in an earlier paper (10). 




• Summary 
I believe that a clearinghouse for information on meetings
is needed now as a source of information on past
meetings, and as an alerting service for the scientific


An Operating Model of a National Information System 




The eventual configuration of any National Information
System will require close coordination between
the many existing indexing-abstracting services. Reduction
of the overlap among these services is one
of the important objectives of a national system. North
American Aviation, Incorporated, has developed and
implemented a system which solves some of the problems
of linking services or networks together so as to
combine maximum scope of information retrieval with
minimum indexing and abstracting effort. The techniques
used in this system are discussed and some
proposals are presented showing how these systems
techniques can be combined with other proposed
methods for use in a National Information System.


• Introduction


The need for continuing improvement of our National
Information System is recognized by government, industry,
and our institutions whose missions contribute
to this objective. As a result there are in existence and
in formulation many systems which are designed to further
this general objective, but which differ in the
methods and organizational structure they see as necessary
to improve the national information situation.

The extent of this effort is impressive as illustrated
by the existence of approximately 300 independent abstracting
and indexing services supporting scientific effort
in the United States (1). These have a total publication
of about 2 million citations a year. As more indexing and
abstracting services are created, as the volume of knowledge
increases, and as it becomes increasingly interdisciplinary
in character there will be an increasing
problem of overlap among these services. The major
manifestation of this overlap consists of two or more organizations
indexing and abstracting the same document.

For some time there has been emphasis on the cost
of nonavailability of information, especially that caused
by duplication of research effort. Additional emphasis
needs to be given to the cost of the abstracting
and indexing services which will provide information dissemination
and availability of the required degree, since
this can be expected to increase as the quantity of information
increases. It is thus apparent that any national
system, regardless of the degree of centralization
or decentralization, must have as one of its objectives
the reduction of abstracting and indexing costs caused
by overlapping effort.

Studies have been made of this overlap on open literature.
For example, it is estimated that approximately
35,000 journals are published throughout the world (1).
Data on the coverage of these journals by all the services, 
both profession and project oriented, indicates that on 
the average each one is being covered nearly four times. 

Eighteen of the 300 indexing-abstracting services are 
profession-oriented services. These 18 account for approximately
one third of the 2 million yearly journal
citations. It is estimated that their cost alone will increase
from $7 million in 1961 to $25 million in 1971. An
analysis of 17,000 journals covered by 11 of the 18 profession-oriented
services in 1961 showed a 50% overlap
in journal coverage among these 11 services alone.
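
The cost projection quoted above implies a steady compound growth rate. The following is a rough sketch of that arithmetic only, reading the 1961 figure as $7 million and assuming smooth year-over-year growth between the two endpoints (the source gives only the endpoints; the function name is illustrative).

```python
def implied_annual_growth(start_cost: float, end_cost: float, years: int) -> float:
    """Constant yearly growth rate that carries start_cost to end_cost."""
    return (end_cost / start_cost) ** (1.0 / years) - 1.0


# Figures quoted in the text, in millions of dollars: 7 (1961) and 25 (1971).
rate = implied_annual_growth(7.0, 25.0, 1971 - 1961)
print(f"implied growth: {rate:.1%} per year")
```

Under this assumption the cost of the 18 services would be growing by somewhat more than a tenth each year.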

Although these costs are dramatic, they are probably 
exceeded both absolutely and relatively by the duplica- 
tion existing in the coverage of technical reports. This 
duplication cost extends not only through the community 
of indexing and abstracting services, but through the 
vast complex of libraries and information centers operating
in hundreds of companies and governmental agencies.
It thus affects not only the profits and costs of the 300 
indexing-abstracting services and their subscribers, but 
impinges on the tax dollar and the general business profit 
dollar as well.

Several promising solutions to the duplication problem
have been proposed. One of these is the use of “Modular
Content Analyses,” proposed by Lancaster and Herner
(2). This approach consists of the preparation of a
modular abstract which contains: (1) a citation; (2)
an annotation; (3) three abstracts (indicative, informative,
and critical); and (4) a set of modular index entries.
This approach assumes that one organization would prepare
“Modular Content Analyses” for documents or
articles from specified sources, or for specified subject
disciplines, and make these available to other indexing-abstracting
services which, with minimum editorial effort,
would use selected portions for their own publication.
Realization of the potential benefits of this approach depends
on an organizational and administrative system of
cooperation which would effect agreements on: “who”
would prepare Modular Content Analyses for specific
subject areas or for specific journals; “who” would be sent
copies of these; and what costing procedure, if any,
would be utilized.
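
The four-part package described above can be pictured as a simple record from which a receiving service picks the modules it wants. This is only an illustrative sketch: the class name, field names, and `select` helper are assumptions made here, not part of the Lancaster and Herner proposal, and the values are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModularContentAnalysis:
    """One package: citation, annotation, three abstracts, index entries."""
    citation: str
    annotation: str
    indicative_abstract: str
    informative_abstract: str
    critical_abstract: str
    index_entries: List[str] = field(default_factory=list)

    def select(self, *modules: str) -> Dict[str, object]:
        """Return only the requested modules, as a receiving service might."""
        return {name: getattr(self, name) for name in modules}


analysis = ModularContentAnalysis(
    citation="(placeholder citation)",
    annotation="(placeholder annotation)",
    indicative_abstract="(placeholder indicative abstract)",
    informative_abstract="(placeholder informative abstract)",
    critical_abstract="(placeholder critical abstract)",
    index_entries=["(placeholder entry)"],
)

# A service that publishes only short notices might take three modules.
selected = analysis.select("citation", "indicative_abstract", "index_entries")
print(sorted(selected))
```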


A plan proposed by Robert Heller and Associates (1)
recognizes the importance of the latter aspect. This plan
would include the creation of “Organization X” which
would be responsible for, in effect, reusing and diversifying
the products (possibly modular content analyses) of
the Profession Oriented Indexing-Abstracting Services.
Its function would be to selectively disseminate these
products to one or more of the 270 Project (or Mission)
Oriented Services who would use them, again with
minimum editorial effort on their part, in their Project
Oriented secondary publications. The result would be a
reduction in the duplication of indexing and abstracting
effort between Project Oriented and Profession Oriented
Services. The plan further recognizes the need for an
administrative scheme of cooperative effort among the
Profession Oriented Services to reduce the existent inter-service
duplication.


An organizational scheme somewhat similar to the “Organization
X” proposal is that of the “National IR Network
Coordination Center” proposed by Jonker as part
of a proposed national system of interlinking Information
Retrieval Networks (3). In addition to serving as
a standards originator and repository of indexes, search
files, etc., this organization would search tapes from one
or more networks for a given requester network. The
product of the search would include citations and abstracts
which the requesting network (or service) would
then include in its own collection and/or publication with
a minimum of editorial effort.
These proposals, and others, though differing slightly
in approach, represent methods for improving our National
Information System while at the same time reducing
the costs inherent in the duplication which now
exists in the system. Implementation of any of these approaches,
no matter how promising they appear, is the
major problem, however. The time and costs involved
make it almost mandatory to “prove” the “workability”
and feasibility of the potential solutions via pilot systems
or by implementation in organizations or associations
which have some of the characteristics of the national
system and which would therefore serve as a model or
microcosm of the National macrocosm.
North American Aviation, Inc., has implemented a
system having many of the characteristics of a national
system. This system's approach is complementary
in concept, but more limited in application than
some of the proposals discussed above. As such, it represents
a study in the development and use of techniques
by which some of the problems of a multivariate community
of indexing-abstracting services have been solved.
North American’s subnational microcosm generates and
utilizes information covering a broad range of products
and subject disciplines and a growing amalgam of interdisciplinary
tasks requiring access to both open and closed
literature. These needs also result in variations of indexing
approaches, especially those due to, in effect, a mixture
of project and profession-oriented indexing.


The divisions of North American generate approximately
8,000 technical reports each year. In addition,
the nine major libraries accession 52,000 external reports
per year. These are available to users through 9 main
libraries and 18 branch libraries. Duplication of indexing
effort prior to the new system varied from 5 to 15%,
resulting in overlapping effort on as many as 6,000 reports
each year. The objectives of the new system were
not only to solve the “information explosion” problem
but to eliminate the extra costs of indexing these “common
holdings.”
An exposition of this approach is only possible in
the context of the North American Aviation Technical
Information Processing System. Therefore, we will proceed
to a general description of this system and then to
the methods used to eliminate the effects of overlap in
this milieu.


• North American Aviation Technical Information
Processing System
Basically, this is a document retrieval system. Its output
is similar to the secondary publications of many
abstracting and indexing services. These outputs are:


1. An Accessions Catalog. This contains a citation, a
set of descriptive terms, and an abstract for each
document received by North American Aviation
Technical Information Centers and Libraries. Each
division has a unique series of accession numbers
which make up a separate section of the catalog.
2. A Permuted Descriptive Terms Index.
3. An Author Index.
4. A Document Number Index.
5. A Contract Number Index.
6. A Source Index.


In addition to the production of the Catalog and
Indexes, the system also includes capabilities for SDI
(Selective Dissemination of Information) and Retrospective
Search and Retrieval. The computer system for
index and catalog production was implemented in the 
summer of 1964. The SDI and Retrospective Search sub- 
systems were implemented in the summer of 1965. 

The system is analogous to the “Union Catalog” approach
in that the catalog and indexes combine bibliographic
information for the holdings of nine geographically
separate North American divisional information centers
and central libraries: namely, Atomics International
Division, located in Canoga Park, California; Autonetics
Division, in Anaheim, California; Columbus Division, 
Columbus, Ohio; Los Angeles Division, Los Angeles, 
California; Rocketdyne Division, Canoga Park, Cali- 
fornia; McGregor Facility of Rocketdyne Division, 
McGregor, Texas; North American Science Center, 
Thousand Oaks, California; Space and Information 
Systems Division, Downey, California; and the Tulsa 
Facility of the Space and Information Systems Division, 
Tulsa, Oklahoma. 

The North American system is divided into two levels
of processing: a Division level system and a Corporate
level system. The Division level system includes cataloging,
indexing, and preparing abstracts (where abstracts
are already in the document, these are used where possible,
with modification as required). After these steps are
performed, the information is translated into either 
punched cards or punched paper tape. The IBM 1401 
computer is used to convert the paper tape or cards 
to magnetic tape, to perform certain audit and edit func- 
tions, and to create and update a Division Master File 
which contains all the accessions for a division. Each 
division normally operates its system on a weekly basis. 
The division level output includes a shelf list and an 
option for printing 3x5 cards. Once a month a tape 
containing the month’s accessions is generated for input 
to the corporate level system. This is then sent via microwave
or telephone line transmission to the Corporate
Data Processing Center.

Fig. 1 is a block diagram representation of the Corporate
Level System. The system utilizes an IBM 7010/
1301, i.e., a combination of random access disks and
magnetic tapes, for processing.

The first program in the corporate system merges the
input from the nine division level systems; performs
various audit, edit, and decoding functions; and generates
the tape used for printing the Accessions Catalog. The
second major program handles only the citation and
descriptive terms which have been placed in document
number sequence. This program detects common holdings,
merges the records together, and prepares a tape
which is used to print the Document Number Index.
The third major program prepares the Permuted Descriptive
Terms Index and creates inverted term files which
are used for a term postings report and retrospective
searches. The fourth major program prepares the Author,
Source, and Contract Number Indexes. The output of all
these programs is a “print tape” which is in SC 4020 (a
microfilm recorder) language format. These tapes are
run through the SC 4020 which prints the indexes and
catalog on film at the rate of 6,000 lines per minute. This
film is then used to directly create offset masters via
Xerox Copyflo. The published indexes and catalog are
then printed using normal offset print procedures. Output
options include the automatic preparation of microfiche
from the film output so as to reduce the hard copy
printing costs for those divisions which have the necessary
reader/printers.

Fig. 1. North American Aviation, Inc., indexing and processing system: general computer program flow. [Block diagram; legible box labels include "3. Permutation" and "4. Prepare Author, Source, and Contract Number Indexes," with a microfiche output.]


• Cooperative Indexing Scheme


In order to realize the cost savings provided by the
capabilities of the system, it was necessary to establish
a corollary administrative system. The administrative
system is referred to as the “Cooperative Indexing”
scheme.


To implement the cooperative indexing scheme, it is
first necessary to assign indexing responsibility to a division.
This is achieved by first determining which division
receives all of the reports from a given source or
receives a greater quantity and coverage of the output
of a given source than any other division. The division is
then contacted to verify if this, in fact, reflects a true
discipline or mission interest. A division which meets these
criteria for a given source can then be assigned the responsibility
for indexing and preparing abstracts as
necessary for all documents received from that source.


Sources for which the above determination can be
made are designated as “Common Sources.” The list of
common sources and the division responsible for each
one is coordinated periodically and disseminated to each
division. When “nonresponsible” divisions receive documents
from a common source, they prepare what is referred
to as a skeletal input. This consists of that division's
accession number, the source code, and the
report number. The skeletal input may also include
information that is unique to each division, such as branch
library location, number of copies, etc. In order to make
this approach feasible, a standard code for sources was
developed. The provision for skeletal input of common
holdings is one of the major reasons for the use of such a
standard code. (The other reasons include achievement
of consistency of sources so that a “clean” Source Index is
possible, and reduction in punching and processing.)


Usage of the skeletal input is not restricted to the
above situation. If a document from any source is already
in the index, having been previously accessioned by
another division, any other division receiving this
document can then use the skeletal input. This can be
easily determined by looking in the Source Index or
Document Number Index.
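
The rule for when a division may submit a skeletal input can be written out as a small decision function. The sketch below is an illustration under stated assumptions, not the NAA implementation: the record layout, the function name, the source codes `MIT-ME` and `OSU`, and the sample index contents are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class SkeletalInput:
    """Minimal record a division submits for a common holding."""
    accession_number: str                  # the submitting division's own number
    source_code: str                       # standard source code (values here are made up)
    report_number: str
    branch_library: Optional[str] = None   # optional division-specific detail
    copies: Optional[int] = None


def may_use_skeletal(division: str,
                     source_code: str,
                     report_number: str,
                     common_sources: Dict[str, str],
                     already_indexed: Set[Tuple[str, str]]) -> bool:
    """True when full indexing by this division would duplicate effort."""
    responsible = common_sources.get(source_code)
    if responsible is not None and responsible != division:
        return True  # another division is responsible for this common source
    # Otherwise a skeletal input is still allowed if the document is already indexed.
    return (source_code, report_number) in already_indexed


common_sources = {"MIT-ME": "L"}           # hypothetical: source code -> responsible division
already_indexed = {("OSU", "R-1568-13")}   # hypothetical contents of the existing index

h_may = may_use_skeletal("H", "MIT-ME", "DSR-8734-5", common_sources, already_indexed)
l_may = may_use_skeletal("L", "MIT-ME", "DSR-8734-5", common_sources, already_indexed)
record = SkeletalInput("H-010309", "MIT-ME", "DSR-8734-5") if h_may else None
print(h_may, l_may, record)
```

In this sketch Division H may file a skeletal record, while Division L, as the responsible division, must prepare the complete input.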


• Techniques and Methodology (4)


In addition to the general functions of master-file updating
and producing an Accessions Catalog, the first
major program of the NAA System performs certain
other operations. One of these is to generate a document
number record containing everything but the abstract, for
each bibliographic reference. During this process the
accession number field, which is normally eight positions,
is expanded by one position to include a “completeness”
code reflecting the degree of completeness of the bibliographic
reference. The program automatically assigns one
of the following four “completeness” codes:

0=The input for this accession number contains no
descriptive terms and no abstract. (This would be
a skeletal input.)

1=The input for this accession number contains no
abstract but does have descriptive terms. (This
would occur when a “nonresponsible” division has a
special interest in the document and may desire
a different indexing approach for that document.)

2=The input for this accession number contains an
abstract, but does not have descriptive terms. (This
would not normally occur, but if it does, the user
would be referred to the citation containing an
abstract.)

3=The input for this accession number contains both
an abstract and descriptive terms.
These document number records are then sorted in a
modified sort program. Phase I of this program expands
the document number field into subfields of 12 characters
each with leading zeros, which, in effect, right
justifies each number group in the document number.
For this purpose, the program detects a numeric subfield
in scanning from left to right as any group of numbers
following a special character (such as a dash) or any
group of numbers following an alphabetic character.
Phases II and III of the modified sort program consist
of normal sorting procedures.
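
The effect of Phase I can be shown with a short sketch. It is a simplification, not the original program: every run of digits is padded to 12 characters, including a run at the very start of the number, which the description above does not mention. The function name is an assumption, and of the sample document numbers only DSR-8734-5 appears in the article.

```python
import re

SUBFIELD_WIDTH = 12  # subfield length given in the text


def expand_document_number(doc_number: str) -> str:
    """Zero-pad each run of digits so that text order matches numeric order."""
    return re.sub(r"\d+", lambda m: m.group(0).zfill(SUBFIELD_WIDTH), doc_number)


numbers = ["DSR-8734-5", "DSR-8734-12", "DSR-964-1"]

plain = sorted(numbers)                                # character-by-character order
ordered = sorted(numbers, key=expand_document_number)  # Phase I key, then a normal sort

print(plain)
print(ordered)
```

Without the expansion, "DSR-8734-12" sorts ahead of "DSR-8734-5" and "DSR-964-1" sorts last; with it, the number groups compare as right-justified numbers.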


In the second major step, the “Document Number
Update Program,” the records created above are used
to update a document number master file, which is in


input, or one or more documents in the input which are
the same as one or more documents already in the
master file.




three divisions who have accessioned the same document. 
The division with the accession number L-084486 (the 
letter prefix designates the division) has prepared a com- 
plete input represented by Record 1. Division A in Record 
2 has prepared a skeletal input and, in addition, has assigned
descriptive terms. This situation could occur for
several reasons, e.g., perhaps Division L has been assigned
indexing responsibility for the particular document but
Division A has a different indexing orientation due to a
difference in mission and product. This may result in
unique requirements for access to documents, which in
turn require the use of certain terms (which may be
division idioms or may be more specific terms resulting
in greater indexing depth) which Division L does not
normally use in indexing. Record 3 is a skeletal input
from Division H. Because the system works with variable
length records it is possible to add one or more descriptive
terms and one or more accession numbers to any
record.

The function of this part of the second program is to
select the record to be saved (and added to) based
on the highest completeness code. Since Record 1 has the
highest completeness code it will be saved. In addition, the
descriptive terms are compared as the records are merged.
If the same term is found in all the records, the one
contained in the record with the lower completeness code
is deleted. If the terms do not match, the term from the
latter record is added to the terms contained in the “save”
record. The result of the merger is Record 4, which contains
three accession numbers, the accession numbers
from Records 2 and 3 having been added to Record 1.
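
The merger just described can be sketched in a few lines. This is an illustrative reading of the procedure, not the original program: the record layout and function name are assumptions, ties in completeness code are resolved in favor of the first record, and the sample terms are taken from the catalog extract in Fig. 3.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DocRecord:
    document_number: str
    accession_numbers: List[str]
    completeness: int                      # 0 to 3, as defined in the text
    terms: List[str] = field(default_factory=list)


def merge_common_holdings(records: List[DocRecord]) -> DocRecord:
    """Keep the most complete record and fold the others into it."""
    saved = max(records, key=lambda r: r.completeness)
    merged = DocRecord(saved.document_number, list(saved.accession_numbers),
                       saved.completeness, list(saved.terms))
    for rec in records:
        if rec is saved:
            continue
        for term in rec.terms:
            if term not in merged.terms:   # matching terms are dropped, new ones added
                merged.terms.append(term)
        merged.accession_numbers.extend(rec.accession_numbers)
    return merged


record_1 = DocRecord("DSR-8734-5", ["L-084486"], 3,
                     ["FLOW", "OSCILLATIONS", "TWO PHASE", "GAS",
                      "LIQUID", "PRELIMINARY", "SURVEY"])
record_2 = DocRecord("DSR-8734-5", ["A-000548"], 1,
                     ["GAS FLOW", "OSCILLATIONS", "FLUID FLOW",
                      "HEAT TRANSFER", "LIQUID FLOW", "LIQUID PHASES"])
record_3 = DocRecord("DSR-8734-5", ["H-010309"], 0)

record_4 = merge_common_holdings([record_1, record_2, record_3])
print(record_4.accession_numbers)
print(len(record_4.terms), record_4.terms)
```

The merged record carries all three accession numbers and twelve distinct terms, with "OSCILLATIONS" appearing once.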

The completeness code is used for an additional purpose
in printing the indexes. When a user locates a
potentially relevant document in any of the indexes, it is
to his advantage to be able to refer to the bibliographic
reference containing the most information about the document.
To assist him in this, the accession number with
the highest completeness code number is always preceded
by an asterisk in the indexes.
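
A minimal sketch of this marking rule follows. The function name and line layout are assumptions, and where several accession numbers share the highest code only the first is marked, a choice the text does not specify.

```python
from typing import List, Tuple


def format_index_line(document_number: str, holdings: List[Tuple[str, int]]) -> str:
    """holdings is a list of (accession number, completeness code) pairs."""
    best = max(code for _, code in holdings)
    parts = []
    marked = False
    for accession, code in holdings:
        if code == best and not marked:
            parts.append("*" + accession)  # asterisk flags the fullest reference
            marked = True
        else:
            parts.append(accession)
    return document_number + "  " + " ".join(parts)


line = format_index_line(
    "DSR-8734-5",
    [("H-010309", 0), ("A-000548", 1), ("L-084486", 3)],
)
print(line)
```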

Fig. 3 shows an actual example of the merger of a document
accessioned by three divisions during the first
months of system operation. Extracts from the Permuted
Descriptive Terms Index, the Author Index, and the
Document Number Index are also shown to illustrate
how common holdings are printed in the indexes. 
The Permuted Descriptive Terms Index contains the set 
of descriptive terms and the document number and 
accession numbers for each document. The other in- 
dexes contain the title of the document plus the 
document and accession numbers. (The astute ob- 
server will note that our Indexing Training Course 
had not yet been completed when these inputs were 
prepared.) In this case, if you knew the report num- 
ber, or if you were looking for documents by Mr. 
Gouse, or if you were searching for documents covering 
the subject term Gas Flow, you would see references to
the three accession numbers. Note that the “L” accession
number would have the highest completeness code
since it contains both an abstract and descriptive terms. 
Therefore, it is always preceded by an asterisk in the 
indexes. The completeness code for L-084486 would be 3. 
For A-000548 it would be 1, and for the skeletal input 
H-010309 it would be 0. This example also shows what 
happens when more than one division includes a title or 
an author in its entry. In this case the author and the 
title under the “L” accession number is saved. The “A” 
division’s citation contains a contract number which is 
not contained in the “L” citation. Therefore, the merged 
record will also contain this number. The effect of the 
merger of descriptive terms can also be seen in the 
extract from the Permuted Descriptive Terms Index. 
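
The three completeness codes quoted in this example follow directly from what each division supplied. The sketch below shows one way to derive the code (0 through 3, the last confirmed by the value given for L-084486); the function name is an assumption, and the term lists and abstract are abbreviated stand-ins for the Fig. 3 entries.

```python
from typing import Optional, Sequence


def completeness_code(descriptive_terms: Sequence[str], abstract: Optional[str]) -> int:
    """Return 0 to 3 following the four cases listed in the text."""
    has_terms = len(descriptive_terms) > 0
    has_abstract = bool(abstract and abstract.strip())
    # An abstract contributes 2 and descriptive terms contribute 1.
    return 2 * int(has_abstract) + int(has_terms)


inputs = {
    "L-084486": (["FLOW", "OSCILLATIONS", "TWO PHASE"], "A REVIEW OF A REPRESENTATIVE NUMBER ..."),
    "A-000548": (["GAS FLOW", "OSCILLATIONS"], None),   # terms only
    "H-010309": ([], None),                             # skeletal input
}
codes = {acc: completeness_code(terms, abstract)
         for acc, (terms, abstract) in inputs.items()}
print(codes)
```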

If a user from Division H came upon this document
via a search of any of the indexes he would turn to the
“L” Accession Number bibliographic reference in the Accessions
Catalog to find the most complete information.
If, after reading the abstract, the document appeared
to be relevant to his need, he would then check it out from
his own library, requesting it under the “H” Accession
Number. With this capability in the indexes, although
the holdings of all divisions are retrievable, the quantity
of documents ordered from other divisions is decreased
significantly.

Fig. 2. Merger of common holdings. [The merged Record 4 output carries the fields Document Number, Source Code, Title, Author, Pub. Date, Contract, and Descriptive Terms, followed by three accession numbers, one each from Divisions L, A, and H.]


PERMUTED DESCRIPTIVE TERMS INDEX

GAS FLOW

*FLOW, *GAS FLOW, *OSCILLATIONS, *TWO PHASE, FLUID FLOW,
GAS, HEAT TRANSFER, LIQUID, LIQUID FLOW, LIQUID PHASES,
PRELIMINARY, SURVEY,
DSR-8734-5    H-010309 A-000548 *L-084486


AUTHOR INDEX

GOUSE, S. W. JR.
TWO PHASE GAS LIQUID FLOW OSCILLATIONS, PRELIMINARY SURVEY
DSR-8734-5    H-010309 A-000548 *L-084486


L-084486    UNCLASSIFIED
TWO PHASE GAS LIQUID FLOW OSCILLATIONS, PRELIMINARY SURVEY
BY- GOUSE, S. W. JR.
M. I. T. DEPT. OF MECHANICAL ENGINEERING
DOCUMENT NUMBER DSR-8734-5    PUB DATE - JUL-64
DESCRIPTIVE TERMS- *FLOW, *OSCILLATIONS, *TWO PHASE, GAS,
LIQUID, PRELIMINARY, SURVEY,

A REVIEW OF A REPRESENTATIVE NUMBER OF REFERENCES FOR
VARIOUS TYPE OF TWO-PHASE GAS FLOW OSCILLATIONS HAS BEEN
CONDUCTED. THE PRINCIPAL CONCLUSION IS THAT AT THIS TIME
THERE IS NO RELIABLE WAY TO PREDICT THE ONSET OF, MAGNITUDE
OF, FREQUENCY OF, AND DISAPPEARANCE OF FLOW OSCILLATIONS IN
TWO-PHASE FLOW SYSTEMS. A CONSENSUS OF THE EXISTING
EXPERIMENTAL RESULTS INDICATES THAT THE TENDENCY FOR A SYSTEM
TO OSCILLATE CAN BE REDUCED BY REDUCING THE INLET SUBCOOLING,
INCREASING THE SYSTEM PRESSURE LEVEL, ELIMINATING HEATED
SECTION EXIT RESTRICTIONS, DECREASING FLUID LEVEL IN A RISER,
IF PRESENT, AND INCREASING THE FLOW RESTRICTION AT THE HEATED
SECTION INLET. IN ADDITION, IT IS BELIEVED THAT USEFUL
STABILITY MAPS CAN BE PREPARED THAT INDICATE REGIONS OF
OPERATION IN WHICH THERE WILL BE NO FLOW OSCILLATIONS. FOR A
GIVEN FLUID SYSTEM, THE PARAMETERS NECESSARY TO DETERMINE
THIS MAP ARE GEOMETRY, INLET SUBCOOLING, FLOW RATE AND HEAT
FLUX.


A-000548    UNCLASSIFIED
TWO-PHASE GAS-LIQUID FLOW OSCILLATIONS PRELIMINARY SURVEY
BY- GOUSE, JR., S.W.
M. I. T. DEPT. OF MECHANICAL ENGINEERING
DOCUMENT NUMBER DSR 8734-5
CONTRACT NONR-1841/T3/
PUB DATE - 1-64
DESCRIPTIVE TERMS- *GAS FLOW, *OSCILLATIONS, FLUID FLOW, HEAT
TRANSFER, LIQUID FLOW, LIQUID PHASES,


H-010309
M. I. T. DEPT. OF MECHANICAL ENGINEERING
DOCUMENT NUMBER DSR 8734-5


Fig. 3. Extracts from catalog and indexes.


• Management Information Aspects


One of the first problems encountered in assigning indexing
responsibility was that, in the previous nonintegrated manual systems,
it was impossible (without extensive expenditures of time) to determine
the degree of overlap among divisions. It was also difficult for any
given division to determine whether it received all of the documents
published by a given source or whether it at least received more
documents from a given source than any other division. Again, the
common holdings technique solved this problem by providing information
from which decisions on indexing responsibility could be made. In
effect, this aspect of the system represents a facet of management
information, at least for the purpose of the cooperative indexing
scheme. The primary vehicle utilized for this purpose is the Source
Index. After a large enough corpus of bibliographic references is built
up for given sources, it is possible to determine answers to some of
these questions. This is done by referring to each source in the Source
Index. If every document listed under a given source carries the
accession number of a given division, but only some of the documents
also carry accession numbers for other divisions, it is possible to
tentatively conclude that the former division should be assigned
indexing responsibility for that source. The conclusion is necessarily
tentative at this point since it presumes the equating of coverage with
subject competence or mission responsibility. The same approach can be
used to decrease acquisition cost since the division which has the 100%
coverage can in some cases perform the acquisition for other divisions.

The approach is illustrated by an extract from the Source Index shown
in Fig. 4. Assuming the samples were of significant size, we can see
that Division A has accessioned every document from Ohio State
University. No tentative decision on indexing responsibility is
possible for the Ohio State University Research Foundation since the
three documents received during the period covered by this issue of the
index were received by three different divisions.
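The coverage test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (each source mapped to a list of sets of division codes, one set per document) and the function name `tentative_responsibility` are hypothetical, and the sample holdings are invented to mirror the two cases discussed, not transcribed from the Source Index.

```python
def tentative_responsibility(source_index):
    """For each source, name the one division holding every listed document.

    source_index maps a source name to a list of sets; each set holds the
    codes of the divisions that accessioned one document (assumed layout).
    Returns None for a source when no single division has 100% coverage.
    """
    result = {}
    for source, documents in source_index.items():
        # Divisions that appear on every document listed under this source.
        full_coverage = set.intersection(*documents) if documents else set()
        result[source] = min(full_coverage) if len(full_coverage) == 1 else None
    return result


# Invented holdings: Division A holds everything from the first source,
# while the second source's three documents went to three divisions.
sample_index = {
    "OHIO STATE UNIVERSITY": [{"A"}, {"A", "L"}, {"A"}, {"A", "L"}],
    "OHIO STATE UNIV. RESEARCH FOUNDATION": [{"M"}, {"H"}, {"RT"}],
}
print(tentative_responsibility(sample_index))
```

As in the text, the result is only a tentative suggestion: full coverage is being treated as a stand-in for subject competence or mission responsibility.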
SOURCE INDEX

OHIO STATE UNIVERSITY

  ANNUAL SUMMARY REPORT - 1 MARCH 1961 TO [illegible] FEBRUARY [illegible]
      R-1093-8            -000946

  ELEMENTARY INTEGRATED DIRECTION-FINDING SYSTEM
      R-1566-12           L-084418

  ETCH PIT INVESTIGATION OF IRON WHISKERS
      R-63-D-O1

  INTERIM ENGINEERING REPORT - 1 JUNE 1964-21 AUGUST 1964
      R-1568-13           A-001161   L-085145

  PROCEEDINGS OF THE OSU-RTD SYMPOSIUM ON ELECTROMAGNETIC
  WINDOWS - VOLUME 4
      R-64-F-04 VOL 4

  SEMI-ANNUAL REPORT - 1 MARCH 1962 TO 31 AUGUST 1962 -
  RECEIVER TECHNIQUES AND DETECTORS FOR USE AT MILLIMETER
  AND SUBMILLIMETER WAVE LENGTHS
      R-1093-10

  TECHNIQUES FOR INTEGRATION OF ACTIVE ELEMENTS INTO
  ANTENNAS AND ANTENNA STRUCTURE
      R-1586-11           L-084044   (A-000192)

  THIN-FILM SUPERCONDUCTING BOLOMETER
      R-1093-5

OHIO STATE UNIV. RESEARCH FOUNDATION

  CONSONANT INTELLIGIBILITY WITH SELECTED VOWELS IN QUIET
  AND NOISE.
      MISCELLANEOUS 63-37 M6311592

  GEODESIC LENS FEEDS AND FLUSH MOUNTED MULTIPLE FEED
  PERFORMANCE IN A GEODESIC LENS.
      1394-12             H-009015

  TO DEVELOP METHODS FOR MEASURING THE PROPERTIES OF
  PENETRANT FLAW INSPECTION MATERIALS
      050-6420-2-64       RT-00446

Fig. 4. Example of use of source index to assign indexing
responsibility.


• Production of Mission or Discipline Oriented Indexes


One of the prime objectives of the North American information system is
to make all of the technical information held by all North American
libraries available to any North American engineer (except for those
documents which may have restrictions on them). Because of this we have
taken the union catalog approach. However, the same processing system
with a slight modification could be used to provide selective indexes
by Mission (or project) or by Discipline (or profession). This could be
accomplished through the feedback loop created by closing Switch A in
Fig. 1. This loop would consist of utilizing the common holdings
technique to add the abstract, etc., from the "complete" input of a
responsible indexing division to the skeletal inputs from other
divisions. The result would be a catalog containing 100% complete
bibliographic references for each division. The indexes would contain
index entries for the holdings of each division and would also indicate
which documents were held by other divisions. This can be achieved by a
simple "select" program for the indexes which would select only index
entries which contain that division's accession number series. Thus, if
we assume that several of the divisions were Discipline or Profession
oriented and that they indexed all documents in their field, and if we
further assume that other divisions were Mission or Project oriented
and that they submitted skeletal inputs for documents which were of
interest to them and which had been indexed by the Discipline oriented
divisions, the end result would be the utilization by the Project
oriented divisions of the indexing prepared by subject specialists.
Many other variations are possible with minor modifications in the
system. For example, a given division might want its indexes to include
every available document on a given subject. In that case, the entries
which would be printed in those indexes would be selected by matching
inputs from all divisions against each division's subject profile.
Those which met the criterion would then be printed in that division's
index.

Within the purview of all the various proposals for national
information systems and information networks there are many system
requirements and objectives which could easily be handled by some
variations of the North American system. These include the present
capability to detect periodical "common holdings" (using the source
code for the journal name and the volume, issue, and page numbers as a
document number) and the capability to use the same type of merge for
books (using author, title, and publication year for matching).
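A minimal sketch of the "select" program mentioned above, assuming (hypothetically) that each index entry carries a list of accession numbers whose prefix identifies the division's series, as in "A-001161" or "L-085145". The record layout and the function name `select_for_division` are illustrative, not the original program.

```python
def select_for_division(entries, series_prefix):
    """Keep only entries bearing an accession number in the division's series.

    Each kept entry also reports the accession numbers of other divisions,
    so the printed index can show which documents are held elsewhere.
    """
    selected = []
    for entry in entries:
        own = [a for a in entry["accessions"] if a.startswith(series_prefix + "-")]
        if own:
            others = [a for a in entry["accessions"] if a not in own]
            selected.append(
                {"title": entry["title"], "own": own, "also_held": others}
            )
    return selected


# Illustrative union-catalog entries (layout assumed for this sketch).
catalog = [
    {"title": "INTERIM ENGINEERING REPORT", "accessions": ["A-001161", "L-085145"]},
    {"title": "ELEMENTARY INTEGRATED DIRECTION-FINDING SYSTEM", "accessions": ["L-084418"]},
]
for row in select_for_division(catalog, "A"):
    print(row)
```

Selecting by subject profile instead of by accession series would replace the prefix test with a match between the entry's index terms and the division's profile.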


• Combining the Modular Content Analysis and the Common Holdings
  Approach


An extrapolation of the common holdings technique used in combination
with modular content analyses would appear to be quite feasible. This
possibility is suggested as one of the possible means of realizing the
cost advantages inherent in modular content analyses.

The proposed modular content analyses contain three types of abstracts
and two levels of index terms (the upper level could presumably be a
subject category). If the modular content analyses were in machine
readable form (e.g., paper tape or magnetic tape) and had certain
standard internal computer assigned codes identifying each element of
information, they could be sent to other abstracting-indexing services
on an exchange or subscription basis. When another indexing service
receives the tapes, it would have a set of established rules to apply
in selecting the information it wishes to include in its own secondary
publication or wishes to add to its own search files. The first test in
this program would be to determine whether the service wishes to
include the modular content analyses in its collection. This could be
done in several ways. In some cases the service might want to include
every document issued by a given source. If so, this would be one of
the acceptance criteria. Or, it might wish to include only information
which met its own interest profile. If so, a normal search procedure
would be used in determining this selection. After the initial decision
was made, the program would then have to select the parts of the
modular content analyses desired. Assuming that each of the three
abstracts was identified in the machine readable media with a single
digit identifying code, the program would merely select the code which
had been assigned to the abstract it desired. If each higher level
index term was also assigned a standard code (this would be simple if a
subsumption scheme or a subject category scheme having a limited number
of categories was used), the program could contain provisions for
selecting specific terms subsumed under some of the headings, but
possibly selecting only the more general terms for some subject areas.
It would thus be possible for the using organization to select the
depth of indexing it desired for each document. Such elements as
author, title, etc., would be automatically selected whenever the same
criterion was met.
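The two-stage selection described above (first accept or reject the record, then pick the parts wanted) might look roughly like the following. Everything about the record layout is an assumption made for illustration: abstracts keyed by a single-digit code, and index terms stored as upper-level categories each with a list of subsumed specific terms. The sample record is invented.

```python
def accept(record, wanted_sources, profile_terms):
    """First test: take the record if its source is wanted or it matches the profile."""
    if record["source"] in wanted_sources:
        return True
    all_terms = set(record["terms"])  # upper-level categories
    for specifics in record["terms"].values():
        all_terms.update(specifics)   # subsumed specific terms
    return bool(all_terms & profile_terms)


def extract(record, abstract_code, deep_categories):
    """Second stage: keep one abstract and choose the depth of indexing per category."""
    index_terms = []
    for category, specifics in record["terms"].items():
        index_terms.append(category)
        if category in deep_categories:  # specific terms only where wanted
            index_terms.extend(specifics)
    return {
        "author": record["author"],   # author and title are always carried along
        "title": record["title"],
        "abstract": record["abstracts"][abstract_code],
        "terms": index_terms,
    }


# Invented record in the assumed layout.
sample_record = {
    "source": "OHIO STATE UNIVERSITY",
    "author": "AUTHOR, A.",
    "title": "SAMPLE REPORT",
    "abstracts": {1: "annotation", 2: "indicative abstract", 3: "informative abstract"},
    "terms": {"ANTENNAS": ["GEODESIC LENS FEEDS"], "MATERIALS": ["IRON WHISKERS"]},
}
if accept(sample_record, set(), {"IRON WHISKERS"}):
    print(extract(sample_record, 2, {"ANTENNAS"}))
```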


• Use of the Technique to Tap Files from Other Indexing-Abstracting
  Services


Since North American Aviation is a space contractor, the NASA master
files (representing the content of the STAR and IAA indexes) are
available for use. In addition, tape subscriptions are now becoming
available from other indexing-abstracting services. Such tapes will be
used for searches but, in addition, they will in some cases be used to
reduce indexing costs even further by treating NASA or any other
service involved as a "responsible division" for documents which they
also index. In most cases, this will have only limited application
since those divisions which obtain microfiche for most or all of the
STAR holdings will use the STAR index. There are two situations,
however, where the common holdings technique will be utilized. The
first is applicable to smaller divisions which have in their holdings
only a small quantity of reports which are indexed in STAR. A program
has been designed which will allow a division in this situation to
submit a skeletal input for any document which is, or will be, in STAR.
The program will then select the citation for that document from the
STAR tapes, resulting in a reduction in the indexing effort required.
The second use will be where information on a document is required in
the division files so that it can be used as part of the division's
circulation control and/or inventory system. By use of the skeletal
input, a division can economically augment its files so that
information on all of its holdings is available for those and other
types of library administrative control systems.


• Use of the Common Holdings Technique in Future Systems


Many of the proposals for future information systems include the use of
terminal devices in libraries or information centers which will be used
for remote inquiry of mass-storage units. Developments in the software
field indicate that in the near future we can also expect to have
workable and economical automatic indexing systems. Assuming that
remote inquiry devices were utilized, the common holdings technique can
still be used to advantage. For example, when a North American division
received a document it would first query the mass-storage unit to see
if the document had already been indexed by someone else. This would be
done by typing in a skeletal input. If the system "answered back" that
the document was already indexed, the inquiring division would merely
add its accession number as an input to the mass-storage index file.
Thus, the same basic approach will continue to be applicable to
decentralized information systems as these systems evolve by adopting
new [illegible] and [illegible].
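The remote-inquiry exchange described above can be sketched as a lookup followed by an update. This is a hedged illustration only: the in-memory dictionary standing in for the mass-storage index file, the key of (source code, document number) used as the skeletal input, and the function name `register_holding` are all assumptions, not features of the system as built.

```python
def register_holding(index_file, skeletal_key, accession, full_record=None):
    """Query the shared index with a skeletal input and record the holding.

    If the document is already indexed, only the accession number is added.
    Otherwise a new entry is opened; full_record stands for the complete
    indexing that the inquiring division would then have to supply.
    """
    entry = index_file.get(skeletal_key)
    if entry is not None:
        entry["accessions"].append(accession)
        return "already indexed"
    index_file[skeletal_key] = {"record": full_record, "accessions": [accession]}
    return "new entry"


# A dictionary stands in for the mass-storage unit in this sketch.
mass_storage = {}
key = ("OHIO STATE UNIVERSITY", "R-1566-12")  # (source code, document number)
print(register_holding(mass_storage, key, "L-084418", full_record={"title": "..."}))
print(register_holding(mass_storage, key, "A-000001"))  # invented accession number
```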
Brief Communication


The SLIC Index


The principal weakness of conventional indexes is that, though there
need be no limit on the complexity of the subject designations used,
there can be great difficulty in matching those designations when a
search is made. The indexer may assign to a document every term which
the searcher can conceivably use, but the search requirement is likely
to be specified by only some of those terms. Because the indexer has
used terms with which the searcher is not concerned, the entries
relevant to the search are scattered throughout the file. To take a
simple example, the searcher requiring information on “the stability of
single-engined aircraft” may be faced with the problem of extracting
relevant items from a file consisting of entries such as:


AIRCRAFT, EXECUTIVE, SINGLE-ENGINED — Economics
AIRCRAFT, EXECUTIVE, SINGLE-ENGINED — Stability
BOMBERS, JET PROPELLED, SINGLE-ENGINED — Control
BOMBERS, JET PROPELLED, SINGLE-ENGINED — Landing — Stability
BOMBERS, JET PROPELLED, SINGLE-ENGINED — Maneuvers
BOMBERS, JET PROPELLED, SINGLE-ENGINED — Stability
BOMBERS, CARRIER-BORNE, SINGLE-ENGINED — Control
BOMBERS, CARRIER-BORNE, SINGLE-ENGINED — Design
BOMBERS, FOUR-ENGINED — Take off — Stability
FIGHTERS, SINGLE-ENGINED — Stability
FIGHTERS, SINGLE-ENGINED — Vulnerability


Concept coordination systems solve this problem completely. The
searcher using an optical-coincidence card system would stack the cards
chosen for his search, and the fact that other terms had been used by
the indexer would not affect the success of the search in any way.

It is generally believed that the solution to the problem as it affects
conventional indexes is to permute all the terms in a given heading.
(The word “permutation” is used in its mathematical sense here. KWIC
and like indexes are not permuted indexes, but are “rotated” or
“cycled.” More properly called permuted indexes are those using the
Universal Decimal Classification, where class numbers are arranged in
different orders, using the colon.) Permutation undoubtedly provides
for retrieval by citing any combination of terms in any order, for the
multiplicity of entries must necessarily cater for every possible
approach. It is, however, not only extravagant but quite unnecessary.
Our requirement is to provide for retrieval when any combination of
terms is used by a searcher, and it is combinations in the mathematical
sense, and not permutations, with which we are concerned. We need to
provide in our visible index every combination of terms from the total
number of terms assigned to a document by the indexer. If $n$ is the
number of terms assigned by the indexer, we must provide entries
consisting of every combination of 1 from $n$, every combination of 2
from $n$, every combination of 3 from $n$ ... every combination of $n$
from $n$ (i.e., ${}_nC_1 + {}_nC_2 + {}_nC_3 + \dots + {}_nC_n$). This
can be expressed simply as $2^n - 1$. We have not taken into account
the question of the order in which the terms are to be cited, but in
fact the index can function on the basis of combinations only if
alternative orders are rejected (which means rejecting permutation) and
a fixed order of terms in each heading is adopted. The obvious order
for an index using ordinary language terms is alphabetical.

Let us suppose that an indexer assigns four terms (A, B, C, and D) to a
document and that we use alphabetical order as our standard order of
terms in each heading. The entries which he needs to make in order to
provide the facility for concept coordination in an ordinary visible
file are as follows:


    1. A            9. B
    2. AB          10. BC
    3. ABC         11. BCD
    4. ABCD        12. BD
    5. ABD         13. C
    6. AC          14. CD
    7. ACD         15. D
    8. AD


If the same terms were used in an optical coincidence card system and a
search was made for, say, the subject represented by the terms B and C,
this document would be retrieved though it has the additional terms A
and D. It is clear, then, that a search for BC in a visible file which
contains the entries listed above would be satisfied by the one entry
BCD (No. 11) and that the entry BC (No. 10) is superfluous, for any
entry consisting of or beginning with the sought terms is relevant to
the search. It is therefore possible to reduce the number of entries
required in a visible file still further, for all entries which form
the beginnings of larger entries can be dispensed with. Thus, in the
list above, Entries 1, 2, 3, 6, 9, 10, and 13 can be deleted and the
remaining entries are as follows:


    ABCD        BCD
    ABD         BD
    ACD         CD
    AD          D

This is the absolute minimum number of entries required and the total
is now reduced from $2^n - 1$ to $2^{n-1}$, i.e., from 15 to 8 in this
case. Table 1 shows a comparison of the number of entries produced by
the three methods: permutation, the $2^n - 1$ formula, and the
$2^{n-1}$ formula.


TABLE 1.

No. of terms
assigned to                         All              Selected
document by      Permutations       combinations     combinations
indexer ($n$)    ($n!$)             ($2^n - 1$)      ($2^{n-1}$)

      2                  2                 3                2
      3                  6                 7                4
      4                 24                15                8
      5                120                31               16
      6                720                63               32
      7              5,040               127               64
      8             40,320               255              128
      9            362,880               511              256
     10          3,628,800             1,023              512
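The three counts compared in Table 1 follow directly from the formulas in the text, so they can be recomputed as a check. The short sketch below does only that; the function name `entry_counts` is illustrative.

```python
from math import factorial


def entry_counts(n):
    """Return (permutations, all combinations, selected combinations) for n terms."""
    return factorial(n), 2**n - 1, 2 ** (n - 1)


# Print one row per number of terms, as in Table 1.
for n in range(2, 11):
    permutations, all_combinations, selected = entry_counts(n)
    print(f"{n:>2} {permutations:>10,} {all_combinations:>6,} {selected:>4,}")
```

The selected-combination count doubles with each added term, which is why the text goes on to limit the number of terms per set.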


It is clear that the number of entries required in a visible file to
give this facility is still too high to make conventional indexing on
the usual card index principle a viable proposition if the average
number of terms used for a document is high, for the labor of producing
and filing so many entries would be excessive. The index which has been
produced at ICI Fibres Limited does in fact use the formula $2^{n-1}$,
but the principle of using an accession number to identify documents
has been used, as with an optical coincidence card system, and the
entire process of generating the appropriate entries from the group of
terms used by the indexer, and the printing of the index, has been
assigned to an IBM 1401 computer. The maximum number of terms which can
be assigned to specify a subject has been set at 5, which means that
such a subject receives 16 entries in the index. This limitation on the
number of terms is the one disadvantage of the system as compared with
an optical coincidence card or any other type of coordinate system, but
any number of 5-term sets can, of course, be assigned to a document.
Within these limits, concept coordination is as positive as with any
other system. Because the index lists combinations which are a
selection from the total number of possible combinations, it has been
dubbed “The SLIC Index” (Selective Listing In Combination). The
following is a description of the practical production of the index.

Documents are entered in an accessions register in order to assign to
each a unique number, as for an optical coincidence card system. The
same vocabulary is used as is used with the optical coincidence card
index (ICI Fibres has such a card system using the French “Selecto”
equipment), except that no term of more than 14 characters is
acceptable because of the limited field sizes used on the IBM cards.
The indexer assigns to a document a group of terms, not exceeding five
in number, and these terms he enters on an indexing slip in
alphabetical order, together with the accession number of the document,
e.g.:


AIR: COOLING: CRYSTALLIZING: 
SPHERULITES: TEMPERATURE 1394 


This information is punched into a standard IBM 80-column card which is
divided into 5 fields of 15 columns each and one of 5 columns. The
15-column fields each take a term and the 5-column field takes the
accession number.
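The card layout just described (five 15-column term fields followed by one 5-column accession field, 80 columns in all) can be mimicked with a fixed-width string. This is a sketch under assumptions not stated in the text: terms are left-justified and blank-padded, unused term fields are left blank, and the accession number is right-justified. The function names are illustrative.

```python
TERM_FIELDS = 5      # number of term fields on the card
TERM_WIDTH = 15      # columns per term field
NUMBER_WIDTH = 5     # columns for the accession number
MAX_TERM_LENGTH = 14  # longest term accepted, per the text


def punch_card(terms, accession):
    """Lay out up to five terms and an accession number as an 80-column record."""
    if not 1 <= len(terms) <= TERM_FIELDS:
        raise ValueError("a card takes between 1 and 5 terms")
    if any(len(term) > MAX_TERM_LENGTH for term in terms):
        raise ValueError("no term of more than 14 characters is acceptable")
    if len(str(accession)) > NUMBER_WIDTH:
        raise ValueError("accession number does not fit the 5-column field")
    # Terms go on the card in alphabetical order; padding is an assumption.
    term_part = "".join(term.ljust(TERM_WIDTH) for term in sorted(terms))
    term_part = term_part.ljust(TERM_FIELDS * TERM_WIDTH)
    return term_part + str(accession).rjust(NUMBER_WIDTH)


def read_card(card):
    """Recover the terms and accession number from an 80-column record."""
    fields = [card[i * TERM_WIDTH:(i + 1) * TERM_WIDTH].strip() for i in range(TERM_FIELDS)]
    terms = [field for field in fields if field]
    return terms, int(card[TERM_FIELDS * TERM_WIDTH:])


card = punch_card(
    ["AIR", "COOLING", "CRYSTALLIZING", "SPHERULITES", "TEMPERATURE"], 1394
)
print(repr(card))
```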

These “primary” punched cards are fed to the IBM 1401 computer which is
programed to reproduce the “derived” cards forming the additional
entries according to the prescribed formula, and this it will do
whether a primary card has been given 5, 4, 3, or 2 terms by the
indexer. The entire file of cards is sorted into strict alphabetical
order and fed again to the computer which prints the index in the
designed format. The computer is programed to print a given combination
of terms once only and to subsume to that combination all the document
numbers which are relevant. A sample of the finished index is appended
(Fig. 1).

When a search is made, the searcher selects from the vocabulary those
terms which he feels best describe his subject, as though he were going
to use a set of optical coincidence cards. He jots down the terms in
alphabetical order and consults the index at the point where this
combination of terms appears. He takes into account all entries which
comprise or begin with his chosen group of terms, notes the accession
numbers, and consults the accessions register to identify the
documents. He may find cases of the same number appearing under several
headings, but once he has encountered the number he ignores all other
citations.
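The search procedure above amounts to a prefix match on alphabetically ordered headings, with duplicate accession numbers ignored. A minimal sketch follows, assuming the printed index is held as a mapping from a heading (a tuple of terms) to a set of accession numbers; that representation, the function name `search`, and the sample headings are illustrative only.

```python
def search(index, chosen_terms):
    """Collect accession numbers from every heading that begins with the chosen terms."""
    wanted = tuple(sorted(chosen_terms))  # the searcher writes terms alphabetically
    found = set()                         # a set ignores repeated citations
    for heading, numbers in index.items():
        if heading[:len(wanted)] == wanted:
            found.update(numbers)
    return sorted(found)


# Invented headings in the style of the finished index.
sample_slic_index = {
    ("KNITTING", "YARNS"): {1516},
    ("KNOTS", "STRENGTH"): {1438},
    ("KNOTS", "STRENGTH", "TWINES"): {1438, 1454},
    ("KNOTS", "TWINES"): {1438, 1454},
}
print(search(sample_slic_index, ["STRENGTH", "KNOTS"]))
```

On a printed page the matching headings sit together because the file is in strict alphabetical order, so the searcher reads one contiguous run rather than scanning the whole index as this sketch does.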
Appendix


Simple Method of Determining Combinations To Be Used As Entries


The terms assigned by the indexer (a maximum of 5 in our case, which we
will call A B C D E) are arranged in alphabetical order. The first term
is then set down alone, and thereafter the following pair of rules is
applied repeatedly until all the terms have been used:


1. Add the next term to all existing combinations.
2. Repeat all the combinations so formed, deleting the penultimate term
   in each case.


Setting the first term down alone, we have:

    A


Applying Rule 1 we convert this to a new combination by adding the next
available term, which is B:

    AB


Applying Rule 2 we remove the penultimate term, which is A, to form a
new combination, which is B alone. We now have combinations:

    AB
    B


Rule 1 gives A B C and B C, and Rule 2 adds the new combinations A C
and C. Total combinations are now:

    ABC         AC
    BC          C


Rule 1 gives A B C D, B C D, A C D, and C D; Rule 2 adds A B D, B D,
A D, and D. Total combinations are now:

    ABCD        ABD
    BCD         BD
    ACD         AD
    CD          D


Rule 1 gives A B C D E, B C D E, A C D E, C D E, A B D E, B D E, A D E,
and D E; Rule 2 adds A B C E, B C E, A C E, C E, A B E, B E, A E, and
E. The final list of combinations is:

    ABCDE       ABCE
    BCDE        BCE
    ACDE        ACE
    CDE         CE
    ABDE        ABE
    BDE         BE
    ADE         AE
    DE          E


Arranged in alphabetical order, the list becomes:

    ABCDE       BCDE
    ABCE        BCE
    ABDE        BDE
    ABE         BE
    ACDE        CDE
    ACE         CE
    ADE         DE
    AE          E
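The pair of rules above translates almost line for line into code. The sketch below is an illustration in Python, not the IBM 1401 program; the function name `slic_entries` is hypothetical, and duplicate input terms are dropped as an added assumption.

```python
def slic_entries(terms):
    """Generate SLIC headings by applying Rules 1 and 2 to alphabetized terms."""
    ordered = sorted(set(terms))
    if not ordered:
        return []
    combinations = [[ordered[0]]]  # the first term set down alone
    for term in ordered[1:]:
        # Rule 1: add the next term to all existing combinations.
        extended = [combo + [term] for combo in combinations]
        # Rule 2: repeat those combinations, deleting the penultimate term.
        shortened = [combo[:-2] + combo[-1:] for combo in extended]
        combinations = extended + shortened
    return sorted(tuple(combo) for combo in combinations)


for heading in slic_entries(["A", "B", "C", "D"]):
    print("".join(heading))
```

Every heading produced ends with the last term in alphabetical order, which is another way of seeing why the count is $2^{n-1}$: each of the other $n - 1$ terms is either present or absent.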


It is interesting to observe that Venn diagrams, which are often used
to demonstrate the principle of concept coordination, can be used to
illustrate the principle of the SLIC index. In Fig. 2, three circles
are used to represent three terms: A, B, and C. The several areas
created by the circles are numbered 1-7. It can be seen that the total
number of areas is equal to the total number of combinations
$2^n - 1$, i.e., $2^3 - 1 = 7$. The selected combinations comprising
the entries in a SLIC index using 3-term sets are represented by all
those areas which fall within the last-named circle, i.e., Circle C.
These areas (4, 5, 6, and 7) are four in number:
$2^{n-1} = 2^{3-1} = 4$. This principle can be used for any number of
terms and a 4-term diagram would show the combinations listed in the
body of this paper. It is, unfortunately, impossible to use simple
diagrams for more than 4 terms.


Fig. 1. Sample of finished index. [Facsimile of page 93 of the
computer-printed MEMORANDA INDEX: headings beginning with KNITTING,
KNOTS, and KRALASTIC, each followed by its accession numbers.]


Fig. 2. Principle of the SLIC index. [Venn diagram of three
overlapping circles A, B, and C with areas numbered 1-7.]


American Documentation — January 1966
Letters to the Editor


Dear Sir: 


- In!spite of criticism and the pressures of suggested alter- 
natives, scientific апа technieal papers, issued in serial 
format, are likely io remain the optimum technique for 
publication of most of the new research and development 
results, state-of-the-art and review presentations (1). Data 
suggests further, that as the number of such papers grows, 
repackaging into narrower segments through the splitting of 
old journals and the founding of new ones creates surges in 
circulation which indicate the appropriateness of this tech- 
nique in enhancing the viability of the scientific serial. 

Crities of the scientific serial condemn its distribution of 
too many pnpers of too little interest to any one reader, 
and propose complete disintegration of the mechanism 
through the establishment of separate distribution systems 
relying on demand orders from scientists who would select 
suitable papers from widely circulated title lists or abstract 
journals (2). This is the technique used to a considerable 
extent in the distribution of scientific and technical reports, 
and has not proved wholly acceptable. For one thing, it 
requires the reader to switch from the passive to the active 
mode in the dissemination system, and to operate this way 
continually. 

Critics of the separates distribution proposals cite as 
advantages of the serial its process of preselection for the 
reader, its compensation in terms of prestige for editors 
in return for their considerable efforts, and its suitability to 
the presentation of advertising which allows publishers to 
tap this significant source of funds (3). 

hat is required is a system which would minimize the 
disabilities of both the separates and the serial mechanisms 
while retaining the principal advantages of both. Here, in 
brief, is a proposed system which may serve. In summary, 
it is a suggestion that serial publication be retained, but with 
varying groupings of papers within each series title, tailor- 
made to match smaller segments of readers than ever before 
through the use of selective dissemination of information 
(4) and automated typesetting and sorting techniques. In 
the proposed system, for any one journal: 


1. Papers would continue to be written, refereed, and edited as at
   present.

2. Papers would be classified by subject to several levels of
   specificity using a limited number of terms (descriptor or
   microthesaurus concept).

3. The text of papers and identification numbers for each paper would
   be stored on paper tape. Identification numbers of papers and the
   subject terms describing each would be stored in computer memory.

4. Subscribers would be given identification numbers and would prepare
   profile descriptions, also to be stored in computer memory.

5. The computer would be programed to process its store at
   predetermined economic intervals. The output would be a list of
   subscribers’ identification numbers, the identification numbers of
   the papers matching each subscriber's profile, and a tabulation of
   the number of copies of each paper required.

6. The tape containing the texts of articles, coupled with the print
   order for the number of copies of each paper required, would “drive”
   the typesetting and printing systems.

7. Papers would be automatically sorted into packets, each packet
   corresponding to the individual subscribers’ profiles, as indicated
   by the printout in Step 5.

8. Advertising, news and editorial sections, and a title page and
   masthead would be "wrapped around" each packet; the entire packet
   would be covered, glued, and mailed.


The computer could be used for other control and service functions. If
the number of papers for any one subscriber were too few, according to
a predetermined economic issue size, the computer could select the
appropriate number of additional papers from a more general level of
specificity in the subject file. If too many papers were listed for an
issue to a subscriber, the computer could provide a cutoff mark at the
economic limit and provide a printed list of the titles of the
remaining papers which could be sent to the subscriber, who could
either order full-text copies as separates or wait for his next journal
issue when the full text would be scheduled to appear. Frequent
occurrence of overselection or underselection in this part of the
program would be an indication that the journal issue size should be
changed for the next subscription period.
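Steps 4 and 5 of the proposal, together with the cutoff rule just described, can be sketched as a small matching routine. The sketch is an illustration under assumptions: papers and subscriber profiles are represented as sets of subject terms, a paper matches when it shares at least one term with a profile, and the cutoff keeps the lowest-numbered papers. None of these details is specified in the letter, and the names are hypothetical. Filling an undersized packet from a more general subject level is not shown.

```python
from collections import Counter


def plan_issue(papers, profiles, economic_limit):
    """Match papers to subscriber profiles and tabulate the print order.

    papers:   paper identification number -> set of subject terms
    profiles: subscriber identification number -> set of subject terms
    Returns (packets, title_lists, print_order).
    """
    packets, title_lists, print_order = {}, {}, Counter()
    for subscriber, profile in profiles.items():
        matches = sorted(pid for pid, terms in papers.items() if terms & profile)
        packets[subscriber] = matches[:economic_limit]      # sent as full text
        title_lists[subscriber] = matches[economic_limit:]  # sent as titles only
        print_order.update(packets[subscriber])             # copies per paper
    return packets, title_lists, print_order


# Invented identification numbers and terms.
sample_papers = {
    101: {"INDEXING"},
    102: {"TYPESETTING"},
    103: {"INDEXING", "RETRIEVAL"},
}
sample_profiles = {"S1": {"INDEXING"}, "S2": {"TYPESETTING", "RETRIEVAL"}}
print(plan_issue(sample_papers, sample_profiles, economic_limit=1))
```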

Each author could be provided with a list of the names or
identification numbers of subscribers who had received his paper in
order to be aware of his “audience.”

Libraries would subscribe to serial sets of all papers to be filed
under journal title by paper identification number. Library resistance
to separates handling could be overcome by packaging papers into
serially numbered groups. Bibliographical citation standards would
require the use of the journal title and paper number. The paper number
could be designed to indicate the year of publication or the volume
number. Abstracting and indexing could take place as soon as a paper
was selected for publication.

Selective dissemination would allow a greater number of 
different papers to be included in the journals than is now 
the case. Steps 6 through 8 would require technical develop- 
ments in materials handling. There may, however, be other 
automated processes for serial publication, using alternative 
configurations of hardware, not here considered, which 
would give the same desirable effect. 
essing Equipment. 19 pp. ASDD, IBM, Yorktown
Heights, N. Y. 

RUSSELL SHANK 


Senior Lecturer
Columbia University 
New York, N. Y. 
Dear Sir:


The Golden Rule, as interpreted by publishers, seems to 
require them to avoid inflicting information on any seeker 
who might have to use it. By extension, we must never allow 
anyone to find information for which he may have to go 
through the motions of searching. 

The Yellow Corollary holds that an author is ill-advised to make his
sources findable, since someone may check up on him.

A few hateful scabs may actually wish to avoid bibliographic suicide,
and run the risk of having orders come in from the book-hungry public.
This may interrupt the publisher’s poetry writing, and it may change
his income-tax bracket, but some adventurous or acquisitive souls may
be willing to face it.

The accompanying checklist can be used by either group, as a guide to
bibliographic suicide or as a chart showing the rocks and shoals
through which a newly-launched organization or publication may have to
navigate.

This effort is not so much intended to discourage bibliographic suicide
as to make it more efficient, where it is desired, or to permit people
to avoid it, if they are so inclined.


Bibliographic Suicide-Guide: For Publishers and Authors
GENERAL RULES: 
1. Get people HEGGS! HEGGS is “Eggs-aspirated.” 


2. Give people ASININE! ASININE means "Alphabet 
Soup of Initials with No Interpretation for the Non-Expert." 


3. Start your title with deadwood words. Journal of the 
... Offers so many more filing complications than, say, 
Steel Research Journal. 


4. Make sure that the initials of the name of your group or
organization are embarrassing enough so that you will have to change to
something and complicate life nicely for everyone. The Society for the
History of Operative Technology is ideally worded from this point of
view. After a few years you can change it and get off all those pesky
mailing lists.

5. Start your title with initials. The ASTM Bulletin is a classic of
unfindability, especially after the clever change from American Society
for Testing Materials to American Society for Testing and Materials,
which put everything just subtly out of place. The IRE Transactions on
... and the IEEE Transactions on ... are beautifully unfindable, since
no one can guess whether to find them under the letters as one word or
as though each letter were a separate word.


6. Start your title with an easily misspelled word. Haema- 
tology Abstracts would do nicely, since almost all Americans 
would look for hematology and the Britishers would look 
for the other. Right there you have managed to sidetrack 
half of the searchers, or prospective subscribers. 


7. Use confusing opening words. Committee and institute
are good, since no one can guess whether the abbreviation
should be reconstituted as Committee or Commission, or
Institute or Institution.


8. Name the issuing body in the title of your journal. 
Journal of the Oobleck Research Society is even better than 
Oobleck Research Society Journal, because starting with 
Journal throws off the people who might look for it under 
J rather than O, but it throws them further off. 


9. Above all: Never give a searcher an even break.
McGurk’s Law holds that “Whatever would maximally foul
things up, is maximally likely to happen.” Make sure that
your publication is absolutely unfindable or you will have to
stop long enough to reconfuse the book-starved public which
gets through to you.


Robert L. Birch
Science Index Group
8108 Dashiell Road
Falls Church, Virginia 22042


Erratum 


American Documentation, Vol. 16, No. 4, October 1965, page 340:
The letter concerning Chemical Abstracts Service and Chemical Titles
is erroneously attributed to G. Salton. The writer is Dr. F. A. Tate, 
Chemical Abstracts Service, Columbus, Ohio. 
Book Reviews


1/66-1R. A Directory of Information Resources in the
United States: Physical Sciences, Biological Sciences,
Engineering. 1965. (Issued by the Library of Congress,
National Referral Center for Science and Technology).
U. S. Government Printing Office, Washington, D. C.
pp.


It is difficult to appraise a compilation of this kind,
particularly when personal biases and prejudices are likely 
to color the reviewer's attitudes. But an objective evalua- 
tion of this book, after four months of actual use and test- 
ing (and some correspondence about particular collections), 
leaves one with a sense of the volume’s potential helpful- 
ness, especially when used with other similar tools — which 
is only to say that there is considerable difficulty in maintaining
a complete record of organizations in the ever-changing
fields of the biological and physical sciences and
engineering. 

Alphabetically arranged by name of organization, this 
directory gives addresses and telephone numbers, the latter 
a particularly useful feature, as anyone will know who has
tried to locate a project of a subdivision of a subdivision of 
a university department with the aid of a long distance 
operator and the assistance of the university's telephone 
personnel. For each entry there is a descriptive note, of 
varying detail and length, which tells the scope or interests 
of the organization, but nothing to suggest its size; also, 
the quality of these descriptions varies considerably. Often 
there is a statement telling that book, journal, document, 
or report collections are maintained, although these are 
seldom qualitative evaluations and almost never indicate
quantity (that is, size of the collection). Misleading or
ambiguous statements — the mention of a collection — suggest
that these materials are in some ordered arrangement
and accessible. Interlibrary loan or photocopying policies,
or rules about the availability of materials for the use of
outsiders, are generally indicated. The imperfection of all
these descriptions may lie, of course, in the character of the
questionnaire, or in editorial revision of the returns.

The book’s index is very useful because of the specific
subject headings that have been used, but because there 
are too few headings and indexing has not been full enough, 
many possibilities are missed. For example, Ultra-Violet 
Products, Inc., of San Gabriel, California, says that “Books, 
journals, and reports are collected on ultraviolet and black 
light and fluorescent materials and related equipment. 
Chemicals, inks, and additives are manufactured for use 
with ultraviolet lamps.” An entry like this might make an 
indexer’s fingers itch and the Pis fly, considering that 
the book is intended to be “A Directory of Information
Resources,” but the only entry here is under ULTRAVIOLET
and other possible terms either do not include this organiza- 
tion, or do not appear at all in the index. 

My chief complaint about this book and several others 
like it is the a priori method of compilation. As the “Fore- 
word” says: 

This directory has been drawn from the central register
of information resources being built up on a continuing
basis by the National Referral Center. Because that
register is still evolving and growing rapidly at the time
of publication, the directory itself must be considered as
a preliminary and exploratory effort rather than as a
properly comprehensive guide, or even a properly selective
one. Many resources of known significance have been
omitted, for lack of data or other reasons; some resources
of uncertain value have been included because of that
very uncertainty; descriptions of services and functions
are frequently less clear than could be desired, for lack
of definitive terminology. These and other constraints
must await future correction.

Apologetic explanations of this kind are acceptable but 
are of little help to the researcher who is in need (and how 
frequently we are told that scientists must have their in- 
formation immediately). Of the private associations or 
corporation “libraries” listed here, I have personally in- 
vestigated several that I had not known about before, 
and the “information resources” I saw were certainly not 
libraries in any sense of the word, nor were some of them 
any other kind of fount of information! In most cases there 
were no regular book collections — indeed, few had more 
than a half-dozen shelves of common textbooks — and there 
was no semblance of a catalog of any internal resources 
such as books, vertical file materials, etc. There were, in
some cases, technical information reports of various series, 
but these were seldom unusual, unique, or really selective 
by content (though often only the “latest” reports are kept). 

As for the special information that is available, it is either
considered restricted information (by reason of government 
contract or “only for our own staff”), or it is in the head 
of one or two scientists or executives whose time or willing- 
ness to share their knowledge is limited by various factors. 
Further, one infrequently finds a professional librarian, 
documentalist, information scientist, archivist, or any other 
kind of specialist in charge of the arrangement, maintenance,
or care of these information resources. There is,
rather, considerable chaos and very little sophisticated 
organization of special files or book collections in many 
of these places. Even when specialized series of documents 
are received and have a built-in, easy-filing classification 
and notation system, the filing is put off and the materials 
remain scattered throughout the organization. 

This book, then, can be looked upon as a preliminary
directory of some kind. It is certainly not a guide to any 
kind of library resources —and it is not meant to be, 
really — but one wonders whether it can actually be of 
much service to the people it is made for, even if it should
appear in a more perfect form in later editions.
One thing is certain, it will help all kinds of people increase 
their mailing lists for one reason or another. 

Of course one feature of the index is likely to be helpful, 
even if only by chance: instead of the usual interminable 
citations to page numbers (usual in books of this kind), 
the names of organizations are given; in this way, by 
scanning a list under a subject one might possibly find the 
name of an organization where a former employee, a college 
chum, or a conference drinking companion works. This may 
be especially helpful because so much specialized, confi- 
dential, and even restricted information is passed on through 
the buddy-buddy association of “contacts.” In any case, 
users should remember again that this is not a directory 
of libraries but of organizations, meaning that inquiries 
may not always be received or answered with the service- 
oriented amiability we are used to finding among librarians. 

As the compiler of another kind of guide to library 
resources, the reviewer recognizes that he and the National 
Referral Center people have many problems in common. 
It is his ardent hope that the NRC will, by constant effort 
and careful study, come up with some easy way to describe 
resources accurately and to appraise the willingness and 
ability of organization personnel to share their stored 
information. 


Lee Ash
Compiler, Subject Collections:
A Guide to Special Book
Collections and Subject
Emphases


American Documentation — January 1966 47 


1/66-2R. Author-Title Catalogue. Subject Catalogue.
April 1965. Toronto University Library, Ontario New Uni- 
versities Library Project. 2 vols. 


Catalog of Books Plus a Complete Catalog of 
Reserve Books. 1965. Washington University School of 
Medicine Library, St. Louis, Mo. 267 pp. 


In recent years, advances in technology have begun to
make book catalogs in libraries a reasonable alternative
again for the first time since the advent of Library of
Congress printed cards. As a result, a number of libraries,
new and old, have begun to experiment with catalogs in
this format.

The Ontario New Universities Library Project (ONULP)
and the Washington University School of Medicine Library
catalogs are two recent examples of this trend. Both are
compiled on the computer and use IBM printouts, and both
are divided, but there the resemblance ends.

The ONULP project is intended to provide five new
universities with identical basic collections of about 40,000 
volumes by 1967. The book catalog represents only these 
collections and excludes other materials which the libraries 
will acquire. This catalog is being published monthly, with 
quarterly and semiannual cumulations and annual total 
cumulations. It uses a specially-developed 120-character
upper and lower case print chain with diacritical marks, 
and is divided into two parts: author-title and subject. 
The format is attractive and the 42% reduction ratio 
produces quite legible copy. 

The catalog is compactly arranged, without much wasted 
space in the author-title catalog. Cataloging information is 
given in full in the main entry but is abbreviated elsewhere. 
However, there are three instances where some wasted space 
occurs which might bulk large when the catalog reaches its 
projected 1967 size. Filing titles are used in the Shakespeare 
entries and appear in the printout. Since we are dealing
with a union catalog, the line devoted to location designa- 
tions for each entry appears reasonable at first, but the 
preface explicitly states that only the identical basic collections
are included, and every entry turns out to have the
location symbol for all five libraries. 

In the subject catalog there are almost as many cross
references as there are subject headings with entries under
them. Many, if not most, of these could easily have been
omitted: on the first page the entry Abailard, Pierre,
1079-1142, is surrounded by cross references from five other
forms of the name. There are no intervening entries. This
case, while extreme, is not atypical.

The arrangement is entirely by computer, using this order
of sorts: blank, period, dash, comma, A through Z, 0 through
9. Thus far, provision has been made for disregarding initial
articles in title entries, and qualifiers such as ed. in name
entries.
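
The sort order just described can be sketched in a few lines of Python. This is only an illustration of the collating sequence, not the ONULP program itself: the names COLLATION, sort_key, and file_headings are hypothetical, and two behaviors are assumed rather than reported in the review, namely that upper and lower case file together and that any character outside the sequence is ignored. The sample headings are taken from the Gt Brit - History entries quoted in this review.

```python
# Collating sequence as stated in the review: blank, period, dash, comma,
# then A through Z, then 0 through 9.
COLLATION = " .-," + "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "0123456789"
RANK = {ch: i for i, ch in enumerate(COLLATION)}


def sort_key(heading):
    """Return the list of collation ranks for one heading.

    Assumptions (not stated in the review): case is ignored, and any
    character outside the sequence is skipped.
    """
    return [RANK[ch] for ch in heading.upper() if ch in RANK]


def file_headings(headings):
    """Arrange headings in strict character-by-character sort order."""
    return sorted(headings, key=sort_key)


if __name__ == "__main__":
    sample = [
        "Gt Brit - History , Economic",
        "Gt Brit - History - 1689-1714",
        "Gt Brit - History - 16th century",
        "Gt Brit - History - To 1485",
        "Gt Brit - History - Addresses, essays, lectures",
    ]
    for heading in file_headings(sample):
        print(heading)
```

Because the dash ranks ahead of the comma and letters rank ahead of digits, the sketch files "- To 1485" ahead of "- 16th century", and every dash subdivision ahead of ", Economic". That is the interfiling of form and period subdivisions in a single strict sort.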

The clerical portion of the work is very well performed: 
there are no peculiarities of spacing to lose entries or make 
them fall into two files. However, the mixture of open and 
closed author’s dates, and the evident failure of the com- 
puter to disregard ed. when it follows a person’s dates, might 
well cause problems when the catalog has reached its
projected size. 

The practice of filing on punctuation was probably 
followed because it does solve some difficulties, particularly 
in subject entries. However, it must require extra care in 
cataloging and punching and produces some peculiarities.
For instance, in all subject headings which are subdivided, 
the punctuation mark is both preceded and followed by 
a space: 

    Meteorology , Agricultural
    Athens. Theater of Dionysus
    Art - Africa , South


The ONULP catalog certainly did not create the problems
of subject headings, but it demonstrates them beautifully.
The entries under Gt Brit - History, shown below, are a
mixture of form and dated and undated period subdivisions
filed in a strict sort order bearing no relation to the separation
of form and period divisions, with chronological
arrangement of the latter, which is usually followed.

Gt Brit - History
    - Addresses, essays, lectures
    - John, 1199-1216
    - Medieval period, 1066-1485
    - Stephen, 1135-1154
    - To 1485
    - Tudors, 1485-1603
    - Tudors, 1485-1603—Sources
    - Victoria, 1837-1901—Addresses, essays, lectures
    - 16th century
    - 1689-1714
    - 1714-1837
    - 1760-1789
    - 18th century
    - 20th century
    , Economic
    , Local
    , Military
    , Political

These headings now occupy about 2½ columns in a subject
catalog of just under 2,000 entries. When the libraries
have reached their projected 1967 size of 40,000 volumes, 
arrangement of subject headings will be a major problem 
if their form is not changed before then. 

As in any human effort, the ONULP catalog has its share 
of typographical errors (but not very many) and spots 
where the gods nodded and forgot that computers are logical 
beasts. But to detail these would be hairsplitting, to no point. 
The catalog will fulfill most adequately the purpose for 
which it is intended, and the introduction makes evident 
the awareness that it is a first, experimental effort, and that 
changes are planned.


The Catalog of Books Plus a Complete Catalog of Re- 
serve Books of the Washington University School of Medi- 
cine Library is not that at all. It is a catalog of accessions 
in the Library from January 1 through September 1, 1965,
consisting of 1,050 items. The most unfortunate thing about 
it is the statement in the preface that the Library is not 
planning to continue the printed book catalog. While it 
is possible that the reasons given for this decision might be
valid (production costs and lack of demand for the pre- 
viously-produced, semiannual, cumulated serials holdings 
lists), this catalog certainly will not provide a fair demon- 
stration because it covers only a very small percentage of 
the available material. 

The catalog is quite valuable as an experimental demonstration.
It is part of a computer-based system set up at
the Library to make one input manipulable for numerous 
outputs, from acquisitions to cataloging. The results of 
the experimental work have been well reported in the 
literature, including costs, so need not be reviewed here. 

The catalog is divided into four sections: Author and 
added entry, Title and series title, Subject, and an author
listing of the Reserve collection. (The last of these, sepa- 
rately published, might have provided some idea of the 
real demand for a book catalog.) 

The prefatory material discusses several problems that 
arose during the making of the catalog, and that are present 
in this version. Among these are an oversight, the result 
of which was that in the author and subject catalogs many 
of the entries by the same author are not subarranged by 
title at all, and the remainder are alphabetized only by 
the first letter of the title. In most cases this is a minor 
problem, but in some of the U. S. entries, with up to 19 
items under a single author, not subarranged at all, the
story is different. 

Another peculiarity is the substitution of cross references 
for added entries in the author file. Usually this is not a 
major problem, but when reference is made to an author
heading under which there are several entries the entire
group must be scanned to find the relevant title. 

An article describing the projected catalog in the Bulletin 
of the Medical Library Association states that the subject 
headings were coded for arranging purposes. However, it 
is difficult to imagine what coding procedure could have
resulted in an entry under DIAGNOSIS, filed between CYTOXAN
and DARWIN, plus two entries under DIAGNOSIS, between
DIABETES MELLITUS, JUVENILE and DIARRHEA.




One serious mistake was use of the direct printout without
reduction in size, so that the catalog, which totals about
4,000 entries, is 267 pages long, 8½" x 11". The subject
catalog, with 113 pages for only 1,235 entries, is the worst
offender.

This catalog is admittedly experimental. The conclusions 
reached as a result of its production, and stated in the 
prefatory material, could well have made possible the 
subsequent production of an improved, more complete 
catalog that would have served a real purpose for users 
of the Library, thus providing a fairer test of the system. 
It may be hoped that the Library will reconsider its posi- 
tion and attempt a more complete catalog in the future, 
especially since all the data will be on tape anyway.


Jessica L. Harris 
Rothines Associates 


1/66-3R. The Education of Science Information Personnel
— 1964. 1965. A. J. Goldwyn and A. M. Rees, Eds.
Western Reserve University, Cleveland. 115 pp.


This book presents the proceedings of a two-day conference
held in July 1964. It consisted of five sessions, the
crucial one being the first: 

1. Summary statements by 17 colleges and universities 
describing their programs, approaches, and attitudes con- 
cerning the education of science information specialists 
and information scientists. Of the 17, 14 are library schools 
and 3 are not. Of the 17, 9 have no active program beyond
a single course in “Documentation” or “Information Re- 
trieval,” 4 have emphasized some kind of documentation 
program designed for information specialists, and 4 have a 
defined program in information science as a more or less
theoretical discipline. 

2. Three papers on the general topics of manpower and
research. The first two papers, on manpower, were presented
by Robert Kohn and William Hitt of Battelle, and emphasized
their study of the needs for and use of manpower
in engineering and the natural sciences. Mr. Kohn discussed
the intent and approach; Dr. Hitt presented the methodology
and 10 fundamental questions to be answered (What
is the field? The job function? Routes of entry? Characteristics
of personnel? Skill shortages? Educational needs?)

3. A presentation by Stafford L. Warren of his proposal
for a Library of Science System and Network, which would 
use Medlars as a starting point. 

4. A series of workshop reports on students, faculty,
curricula, and academic organization. The magnitude of
the problems in these four areas is so great that it is a pity
the workshops did not produce more than the limited
results reported in this section. However, these results reflect
the limitations of conferences more than the interest and
capabilities of the participants. Such workshops may represent
useful educational experiences for the participants;
reports of them rarely are useful, and these are no exception.

5. A brief summary by A. J. Goldwyn.


The results of this conference, in comparison with its
predecessors at Georgia Tech., reflect the extent to which
curricula have been formalized throughout the United
States. The last few years have seen progress in at least
three areas of educational programs: instruction in library
automation, education of information specialists, and development
of research programs in information science. The
reports in this book show the magnitude of these developments.

Robert Hayes
School of Library Service
University of California at
Los Angeles


1/66-4R. Technical Dictionary of Librarianship, English-Spanish.
1964. Beatriz Massa de Gil, Ray Trautman,
and Peter Goy. Editorial F. Trillas, S. A., Mexico. 387 pp.

In this work the authors set out to design a dictionary
“for librarians, students, editors, publishers, booksellers,
archivists, and all others who work in the communication
arts or are interested in the technology of librarianship.”





Since there is a scarcity of reference books of this nature,
there is no doubt that the Spanish-English and English-Spanish
vocabulary of over 3,000 terms will be helpful to
those for whom the dictionary is intended. Nonetheless, the
prospective user should not become overly enthusiastic and 
expect to find much of the technical terminology which has 
appeared in the last few years. It is the traditional aspects 
of librarianship which are emphasized, perhaps at times 
to the point of redundancy. Words such as BIBLE, PARAGRAPH, 
PENCIL, BONG, and SCIENCE, that can readily be located in
other standard bilingual dictionaries could have been omit- 
ted to make way for RETRIEVAL, DESCRIPTORS, INDICATORS, and 
the many other terms and expressions that have developed 
with data processing. 

The arrangement of the material is laudable. It is divided
into two parts: Spanish-English and English-Spanish. In
each section the order is alphabetical word by word. The
lexicographers have not merely translated each term from 
one language to the other, but have gone a step further by 
defining the vocables included. However, since the historical 
and/or geographical backgrounds of the words are not in- 
cluded, the reader should realize that there can be varia- 
tion in the meaning from one country to another. This 
is especially true in Spanish-speaking countries where 
regionalisms can persist as a result of the limited inter- 
change of professional literature. In Mexico, for example, 
encabezamiento de materia is the commonly accepted designation
for “subject heading,” yet in Peru the term used
is epigrafe de materia. 

All in all, this technical dictionary is a definite contri- 
bution to the field of library science. It should do much 
toward the systematization of its lexicography. It is to be 
hoped that more books of this type will be forthcoming 
in the near future. 

Arnulfo D. Trejo
School of Library Service
University of California at
Los Angeles


POSITION OPEN 


Chemical Information Specialist 


Diamond Alkali Company has need for a qualified 
individual to maintain the Research chemical and 
biological files. This individual will also correspond 
with outside testing agencies. Individual will have
responsibility for recording results and correlating 
chemical structures with biological data to present 
guide-lines for future synthesis. 


Interested candidates must have a good understand- 
ing of organic chemical structures and nomenclature. 
This position offers both interesting and challenging
assignments with opportunities for professional
growth.


Diamond’s modern Research Center is located in 
a campus type setting approximately 30 miles east 
of downtown Cleveland, Ohio. This is a growing 
community with the advantages of both suburban 
living and a large metropolitan area. 


Qualified candidates are asked to submit their 
qualifications for confidential review to: 


W. L. Abele 

Diamond Alkali Company 

T. R. Evans Research Center 
P. O. Box 348 

Painesville, Ohio 44077 


An Equal Opportunity Employer 
New From


WESTERN PERIODICALS CO. 


The Ladenburg-Reiche Function 
Tabulations created on the IBM 7094 at Rocketdyne Division of North 
American Aviation 

Table of Exponential Functions
Tabulations created on the IBM 7094 at Rocketdyne Division of North
American Aviation 

Eighth National Symposium of The Society of Aerospace Materials and Process 
Engineers on “Insulation—Materials and Processes For Aerospace and 
Hydrospace Applications”

Wave Mechanics Of A Free Particle by E. Fisher 
Contents include The Fundamental Constants, The Neutrino, The Elec- 
tron, The Proton, Unstable and Interacting Particles 

The Computerman’s Dictionary 
A glossary of computer definitions and concepts 

Project ERIE: Development of an Analytical Model for Environmental Re- 
sistance Inherent in Equipment 

IEEE Mid-America Electronics Conference 
Volume 1—Electronic Systems Reliability, 1961 
Volume 2—Measurement and Instrumentation, 1963 
Volume 3—Measurement and Instrumentation, 1965 

Thermodynamic Properties of Individual Substances
Tables published by the Academy of Sciences of the USSR 


Exclusive Distributor 


WESTERN PERIODICALS CO. 


13000 Raymer Street, North Hollywood, California TRiangle 0-0555 
Documentation Abstracts

... is a joint publication under the auspices
of the American Documentation Institute and
the Chemical Literature Division of the
American Chemical Society.

... represents combined coverage of the
former Literature Notes section of American
Documentation, the ACS Division of Chemical
Literature Annotated Bibliography, and the
former coverage of Documentation Digest.

... will issue quarterly — February, May,
August, and November of 1966. Each issue will
contain corporate and author indexes; subject
indexes will be available on a schedule to be
determined.

Subscriptions are sold on a calendar year
basis — $8.00 per year.* Return the coupon
below. Payment with your order is requested.


* Members of the American Documentation Institute will receive the 
first year's subscription free. 


DOCUMENTATION ABSTRACTS—Please enter my subscription for one year 
commencing with the February 1966 issue. At $8.00 per year, payment is 
Pop Science


Editor’s Note: American Documentation is privileged to
publish these remarks. They have been made available as 
the text of a presentation delivered at the Gordon Con- 
ference on Scientific Information in July 1963. 


Of course I chose my title by analogy with Pop Art.
Paintings of endless rows of Campbell's Tomato Soup,
three-dimensional trompe l'oeil Brillo boxes — what
strange pieces these are. How unattractive, really. I
suspect most people don't like Pop Art. And of course
the remarkable thing is that the artist paints them to be
objectionable. He wants to so engulf us with the banality
of our society, with its misuse of art, that we will do
something to change the world or at least to change
ourselves.
The Pop Artist — the good one — is trying to rub our
nerve ends raw so that even as lovely a piece of commercial
photography as a well endowed 36-26-36 in a four
poster will still rouse some objection in us.

She's lovely, isn’t she? 

Makes you feel warm inside. 

But she's not there for the purpose you have in mind. 

She's there to sell cosmetics. Silicone-based cosmetics at 
that, — a product of the silicone chemists at Union Carbide.
So there is one connection — however tenuous — between
pop art and pop science.

But I don’t want to pursue that connection. Indeed this 
entire prologue is designed only to restore ambiguity and 
multivalued criteria to what I’m sure has been an orderly 
week with matters settled by the strict rule of the sci- 
entific method. 

My text for this evening’s sermon — if sermon it be —
comes from the French mathematician, Poincaré. 

On fait la science avec des faits comme on fait une
maison avec des pierres; mais une accumulation de faits
n'est pas plus une science qu'un tas de pierres n'est une
maison.

In English: One builds science with facts as one builds
a house with stones, but a pile of facts is no more a
science than a pile of stones is a house.

I wonder if it is so. Oh, it is not without truth — surely 
there is an elaborate and careful structure of the facts of 
science tied together with the theories of which a mathe- 
matician like Poincaré could be so proud. 


DANIEL I. COOPER
International Science and Technology


But that structure is in good measure the work of
teachers, of the writers of review papers, of the rapporteurs
at our ever more numerous conferences. The building
of science is a more chaotic process. Long before one
has the mansion of science one has the tas de faits.... 
the pile of facts. 

To press my luck on the analogy: the building of sci- 
ence, so elegant, so overwhelming in its final form, passes 
through a stage when the workmen’s materials are strewn 
about and the workmen themselves — dirty, sweaty, clad 
in over-alls — lounge in what will someday be the great 
court. 

What in the world has this to do with Pop? Well what 
is more of the people — which is what popular means —
than the workman. I see him in New York, muscles 
rippling, chewing on his hero sandwich, calling “Hey, 
goodlooking!” or “Bella! Bella!” to the lovely young 
things who pass below his noonday perch while we poor 
professionals can only stare in hungry silence as we trot 
off to our martinis. It’s all pop. Or quack, as Saul Bellow 
has it in Herzog. 

Well now, why divert you, near the end of a week's
serious discourse on scientific information problems,
with images of pretty girls and all that conjures up?
Perhaps it's just because you have been serious, and
abstract, that I feel the need to divert you. I want to
return your attention to the individual acts of generating
a new idea that sum up into the body of scientific
information that has been your concern.

You see, the tradition has grown of making the research 
paper a smoothed-over, tight exposition of the end point
to which a man's researches brought him. Agreed: As part 
of the discussion of where things stand, the paper may 
contain some allusion to the yet unresolved problems of 
the field. But no sense of the research itself, of the gnaw- 
ing uneasiness, the discomfort the researcher experiences 
at having some part of this subject that he loves unclear. 
That’s what drives him to seek clarification, to stay in the 
lab till all hours, to ignore his wife. Because the pleasure 
that comes with clarification surpasses all other pleasures. 

Robert Wilson of Cornell, in an interview in our
magazine, likened the whole experience to throwing up. This
awful queasiness, this rumbling around inside, this
subconscious knowledge that something is going to happen,
that the queasiness can't be maintained. This body
knowledge. And then the relief: the resolution, the
orgiastic moment of relief, the pleasure of having tension
disappear, like finishing a speech, or finishing the
preparation of it.

To do research is to be in a special state of grace. I've
written of the special look I've observed on the faces of
physicists at meetings of the American Physical Society.
These are bearers of a special sort of knowledge, these are,
in the main, happy men. They are priests of a true
religion.

But such is the nature of many of these men, or such is
their training, or such is the tradition of the scientific
paper, that few of them talk about their state of grace.
And fewer yet write about it.

I say we're the poorer for it: We're the poorer because
writing is a means for sharing experience. Insofar as we
share the smoothed-over, prettied-up, rationalized maison
de la science rather than the tas de pierres, to that extent
we are all impoverished. I maintain that our archival
journals suffer for not recording more of the raw
experience. That's why conferences have become so popular
even though oral communication is inherently less
efficient than written: At conferences such as this there is
the man-to-man opportunity to confess error, to recount
the blind alleys, to explore fresh paths together, to
discover that we are not alone in our stupidities.


Now I'm not proposing that the archival journals
become something akin to a poetry reading in a fifth-rate
Greenwich Village Coffee Shop, all passion and no
content. I don't mean for the Journal of the American
Chemical Society to become Chemical Confessions or Chemistry
Confidential. But I do ask for some leavening of some of
the standard research papers with some of the actualities
of the experiment, not the prettied up “Results” with
their echo of our schoolroom experiments with their
preordained outcomes. Not Chemical Confessions but
Chemical Candor. Mind PLUS Emotion.

It's interesting that this sort of thing does take place
with increasing though insufficient frequency in our review
and interdisciplinary journals. Here is Science magazine
carrying a debate about Superconductivity between P. W.
Anderson and Bernd Matthias, both of Bell Labs. You can
get some sense of the quality of this piece from the
opening paragraph of Anderson's remarks:


The conditions under which this article is being written
are unusual. With the other side of the coin being ably
presented by my colleague, B. T. Matthias, I will not
have to qualify my statements or judiciously distribute
credits and concessions, but can flatly state my opinions,
for what they are worth. I suspect that I will be proved
wrong in some measure; I hope the fact of my stating
these opinions will stimulate other physicists to try to
prove me wrong.


And the article follows in that spirit.


Notice another remarkable result of setting aside the
normal, formal structure of the research paper for this
debate in print. The paper is more personal; in the heat
of preparing for the debate Anderson drops the customary
(and awful) impassive voice and says “I.” “I suspect,” “I
hope,” “I will not”; in fact he uses one first person
pronoun or another 9 times in a three-sentence opening
paragraph. Surely some sort of record for a paper in a
scientific journal.


54 | American Documentation, April 1966


Why, it is even more than I find in a [...]
of what is my favorite reading whenever I return to
Boston: the Confidential Chat page of the Boston Globe.
On that page sweet old ladies carry on public
correspondence under equally sweet code names. Here's one:


Chat Editor: I never miss an issue of the Chat, believe
me. I pass it on to my daughter and she passes it on to
a neighbor. Quite often there are things cut out before
they get it, but they do not mind because they are glad
to get it. Sunday's pages were a real bonus. Wish it
could happen more often. I will return to my old pen
name because since I dropped it I have never once seen
it used.
I have six dogs and a cat (they get along fine) and a
very large vegetable garden besides sewing, knitting and
canning. Life is beautiful, my days are full, never time
for everything. Add to that the Boston Globe, what
more do I need?
Hilltop


Now why do I bring that in? Listen to the lady:


I have six dogs and a cat (they get along fine) and a
very large vegetable garden besides sewing, knitting
and canning.


She's proud of what she's doing; it's what she is. It
provides (to use that horrible word) IDENTITY.


Susanne Langer in the introduction to her Philosophy in a
New Key, a fairly steep book about art and symbology,
reminds us that all of art has much in common with the
instincts that cause us to show off our mud pies when we
are young.

I hope you won't forget as you edit scientific papers,
then abstract them, then put the titles into a key-word
index, then study the statistics of such titles, then prepare
the whole for instantaneous electronic retrieval, and then
hold conferences (pleasant conferences) on the whole
subject. . . . I hope in all this you won't forget that you
are dealing with men's passions and hopes, with a
sometimes desperate attempt to leave some sort of scratch on
the face of anonymity before death overtakes us all.
What I'm saying is that a catalog of Picasso's paintings,
while necessary, is still not Picasso.
What I'm saying is that every bit of Scientific
Communication is at the same time a showing off of a mud pie:
“I have six dogs and a cat (they get along fine) and a
very large vegetable garden besides sewing, knitting and
canning.”
What if the mud pie slumps!
What if the cat gets chased once in a while.
What if the paper ain't so vital.
It's my mud pie, my pets, my paper.
So my plea is for a recognition of the Pop in science and
for a search for ways to convey it.




We've found one way in International Science and
Technology: a sort of interview in which good scientists
and engineers talk in a very personal way about how they
do science and what it means to them:

Bernd Matthias of Bell Labs in an interview entitled
The Gambler in the Laboratory:


You see, I like to gamble, and I do the same thing in
physics. I look for new things. If you do this, there are
three possibilities. Either you find what you are looking
for, or you find something else, or you don't find
anything. If you don't find anything, there is nothing you
have to show for it. Oh sure, today people want to
publish negative results, but it is always an anticlimax. I'm
quite willing to gamble. If I find things, fine; if I don't,
well, I've just lost.


Bob Wilson of Cornell, speaking of The Pleasures of 
Physics, telling how he very nearly won World War II 
singlehanded : 


It was late 1941 — right in the worst moments of the 
war. The time of Pearl Harbor. The Battle of Britain 
was just over, but things were still at a very low point. 
If one could make a bomb, that would be the salvation 
of the world, not the damnation of it. 


So with this idea and in that desperate situation, I 
thought, “My God! I just have to learn how to separate 
isotopes!” I thought long and hard — intensely for a 
number of days. My thoughts turned to the electrical 
methods. I can still remember vividly the clear cold air 
and the experience of walking through it and thinking 
“By God, it’s going to come.” And come it did. I was 
conscious that the idea was there within me before it 
finally revealed itself to me. That idea subsequently 
became known as the Isotron. 

I was extremely excited, and, as I walked along, my
ego became all involved. I felt, I, a young man of about
28, would almost personally win the war. In a few
months, if we worked hard, we at Princeton could test
this thing, we could get a few grams of U-235, and then
we would make a bomb and stop the war. And we could
have. It was possible; had the neutron cross sections
worked to be larger, as the British thought, it could
have happened.


The mud pie just slumped, but what a mud pie...
what a moment for any man to experience.

Here is F. C. Williams of Manchester, the inventor 
of the Williams Storage Tube, telling all of us off in an 
interview entitled How to Invent. 


I think it's a great mistake to learn too much, to be
taught too much, to be too good at anything, because
this tends to become important in itself. It's just no
good knowing about these things, if you're not going to
do anything about them. You might just as well study
Shakespeare and know all about that, because you're
not going to do anything with that either. There's a
great danger you know, that scientific education will go
that way. That it will become a virtue within itself to be
able to do things that can already be done, whereas
the true virtue is to be able to do things that have not
already been done.


I would suggest, if you're trying to make progress
nowadays in the computer business, you can do one of two
things: You can either work on your own and try and
make some progress, or you can keep abreast of what
other people are doing. But you damned certainly can't
do both. ...


There's only one easy place to be in science and in
engineering, and that's in the front. If you're there first, you
have nothing to read. You've got all your time to think.


Around now some of you have a perfect right to pro- 
test. 

I can almost hear the thought waves: It's perfectly all
right for you as the publisher of a pop magazine to plead
for the pop in science, but I'm a librarian (excuse me, a
documentationalist). My motto is Orderliness. My
decalogue comes not from Moses but from Dewey. My mud
pie is to keep other people's mud pies in straight rows.

Well, I would respond to the good lady (I know librar- 
ians are not always women, it’s just more pleasant to 
think of them that way) by first pointing out my com- 
plete respect for her craft. Having demonstrated this 
evening that I can barely keep script and slides and refer- 
ences together, I trust I don’t need to elaborate on how 
highly I regard someone who can keep all of the world’s 
knowledge pigeonholed. But, you know, there is pop in 
the library, too. Sure the books are all in the stacks where
they should be... (Or almost so. One of the delights of
a library is coming on a good book on the Mexican War
when you are searching in decimal classification 749.2 for
a book that will tell you how to build that wall your wife
wants. What a welcome diversion for a man whose mud
pies, and walls, always slump!)

But by and large the stacks are orderly, so I shouldn't
try and put you off that way. Rather I would make the
point that there's a lot more going on in the stacks than
you realize. People are browsing. The stacks are orderly,
but our minds are not.

As long as you as a librarian keep your stacks and
shelves open a lot of pop is taking place:


What a nice book this is!

What's that blue book about? 

I like blue bindings. 

What in hell do we get the Journal of Oral Surgery for? 
Etc... Etc... Etc.


Pop...Pop... Pop. 

There's more. My librarian, back home in South
Orange, usually has a suggestion for me. “You'll like this
book, Mr. Cooper.” “I didn't like Herzog but you might,
Mr. Cooper.” And because I can't say no to anyone, and
because I know that the librarian is depressed because the 
town voted down a new library (three times now), I 
usually accept her suggestion and am glad for it. 

What does this suggest for documentationalists? Well, 
maybe these things, and now I am serious — sort of: 

(1) I worry when our journals, our abstract lists, our 
information-retrieval schemes are not in danger of be- 
coming too efficient, too adjusted to what the customer is 
thought to need. I hope the narrowest will always keep 
some window on the world, some device for providing the 
unexpected, the pop, some provision for browsing. 

(2) Specifically, I would hope that those of you who
write computer programs for information retrieval will
provide for browsing in the reference lists thus prepared,
maybe by having the machine read two numbers out
of a random-number table and let the first signify the
location and the second the title of a random (but good)
reference that would stop your reader in his too avid
pursuit of papers on the melting point of gallium. He may
benefit so much more from knowing that new techniques
exist for automating such measurements. Or he may
discover that the galaxy he inhabits is bigger than had been
thought . . . that sort of discovery puts the melting point
of gallium into perspective, somehow.

(3) I don't see why abstract services and computer
programs can't have opinions like my librarian. A nice
little journal called The American Behavioral Scientist has
a section devoted to abstracts of the literature. It's
complete, I believe, but also evaluative. Abstracts that
strike the editors as being of more than ordinary interest
are surrounded by a box. Thus highlighted, they guide the
reader to the best of what otherwise would be a
dull-looking list.


(4) Speaking of dull-looking, can't you folks get those
computers to print out in more interesting type faces?
Frankly one thing that keeps readers away from
computerized abstracts in droves is the fact that they look
like they were prepared by machines for machines. My
whole point this evening is that we are not machines . . .
that all this communication with which we are concerned
is an attempt to transmute the Pop that had occurred in
one man's mind into a Pop in the minds of his audience
wherever they may be.


Just a word more on the meanings of Pop Science and
why all this is so important.

First, as you all know, science is getting much bigger:
Extrapolating ahead it would seem that all mankind will
some day be scientists. If we are not to have a
vulgarization, a degradation of science (a pop science now in the
same sense that the pop artist is protesting the popular
vulgarization of his métier), then we must communicate
the spirit, the actualities of science as well as its facts.
Second, as science impinges on our lives more and more 
the temptation grows to misuse science for commercial 
gain. To use science to sell soda pop. I mean the irritating, 
false use of science in TV and newspaper advertising. 
Finally there is science as Pop . . . Dad, Father, The 
Old Man. Let’s face it— this is an age of science and 
technology. People like us develop the principal
instrumentalities for our modern society, for this
postcivilization, as Kenneth Boulding has called it.
It's terribly important that the technology thus
developed not be mindless, that it be applied with due
regard for the consequences. It's perhaps even more
important that it not be heartless, devoid of passionate
concern. That, really, is why I chose to carry you in this
direction tonight.




The KWIC Index Concept: A Retrospective View


This paper defines and describes the KWIC (keyword
in context) index concept, providing a history of the
concept and of its literature. It discusses variations of
the index, such as the Bell Telephone Index, KWOC
indexes, and the WADEX.

The paper discusses improvements and variations
to the KWIC index, such as manipulation of the index
line, variations of the code, addition of classification
information, combination of author index and title
index, and improvements of the type face. It also
discusses improvements to the preparation of the
KWIC indexes, such as improvement of titles and use
of a thesaurus, and discusses improvement of the use
of the KWIC index. The paper discusses the usage of
the KWIC index and comments on the future of KWIC
indexes and of the KWIC concept.


MARGUERITE FISCHER *


American College of Physicians
Philadelphia, Pa.


* The author wishes to gratefully acknowledge the generous help given
her by International Business Machines Corporation, particularly that of
Charles F. Balz, and of Dr. I. A. Warheit and John H. Gustafson.


• A Review of the Literature


The classic paper on the KWIC index is Keyword-in-
Context Index for Technical Literature (KWIC Index)
published by Hans Peter Luhn in 1959. This paper
introduced the idea and the plan for a permutation index
based on titles, and produced by machine.

Earlier, at the International Conference on Scientific
Information (1958), both Luhn and Ohlman² had
distributed copies of machine prepared permuted indexes
which each had developed independently.

The use of the KWIC index in the preparation of
Chemical Titles was described in a comment published in
Law Library Journal in 1960.³

Papers by Lester Douglas Turner and James Henry
Kennedy, appearing in 1961, explained SAPIR (system
of automatic processing and indexing of reports), which
used the Keyword-in-Context index principle.⁴ The same
year an article by John H. Veyette, Jr., fixed the KWIC
index in a pattern of information retrieval,⁵ and an article
by A. Resnick similarly bore upon an aspect of
information retrieval as related to the KWIC system.⁶

By 1962 the KWIC index and variations of it had
become widespread enough in use to occasion further
explanation, evaluation, and criticism. For this period, the
General Information Manual issued by IBM may be
considered as an authoritative source of information
concerning the KWIC index.⁷ Instructions for production of
KWIC indexes came from the Space Guidance Center,
IBM, in a report by Charles H. Balz and Richard H.
Stanwood. At the same time, Frank V. Giallanza and
James H. Kennedy wrote of the KWIT (Keyword-in-
Title) Index used at the Lawrence Radiation Laboratory,
discussing possible options for preparation of KWIC
indexes of subject, author, report number, and
field-of-interest.⁹ At the University of Oklahoma, a research
project was underway to use the KWIC program for retrieval
of space law materials. The American Diabetes
Association was considering use of the KWIC index to provide
one of a number of desired indexing depths.¹¹ Stanwood
proposed the Merge system, a complete information
system linking the techniques of Keyword-in-Context with
SDI (Selective Dissemination of Information).¹² Library
applications practiced at Bell Telephone Laboratories
were reported by R. A. Kennedy.¹³ Donald H. Kraft
evaluated the efficiency of the KWIC principle on the
basis of his study of legal document title entries.¹⁴
William J. Kurmey undertook the comparison of
keyword-in-context effectiveness with that of subject heading
effectiveness.¹⁵ In England, J. D. Black described the KWIC
concept as offering advantages not possible with
conventional indexing, and cited user reaction to KWIC as being
favorable.¹⁶ Mary Veilleux described a “man/machine”
system that had been in operation since 1952 at Central
Intelligence Agency (CIA) which, unfortunately, had not
been generally known until 1961.¹⁷

The literature of 1963 indicated an expanding interest
in KWIC and its variations. New uses for the KWIC
technique were tried and reported. Richard L. Storrer
described its usefulness in indexing memoranda and letters.¹⁸
It was used for section indexes in cumulative indexes to
computer program abstracts, indexes to special collections
of publications, indexes to research and development
projects, indexes to programs of professional meetings,
and in concordances. In other applications, the KWIC
technique was used for an index to program titles, indexes
to branch office manuals, and an index to manager
responsibilities.

Logically, great interest began to be felt and expressed
on the subject of titles. Mary Jane Ruhl,¹⁹ Lawrence
Papier,²⁰ Walter Brandenberg,²¹ and Jessie Bernard and
Charles W. Shilling²² were among those commenting on
the validity of using titles as a basis for indexing and
suggesting improvements to titles.

Numerous variations to KWIC had appeared by 1963.
E. A. Ripperger and others reported an index called
WADEX which combined author entries with word
entries.²³ H. R. Newbaker and T. R. Savage wrote on the
SWIFT program, which combines features of the KWIC
with traditional format, and which depends on titles
elaborated to the point that they are called notations of
content (NOC).²⁴ Physindex, a subject index halfway
between KWIC indexes and conventional alphabetical
indexing, was described by Nicole Chonex, Andre Chonex,
and Jean Iung.²⁵ Need for author participation in writing
informative titles and the possible difficulties arising with
this need were topics treated by Saul Herner,²⁶ by T. E.
Conolly,²⁷ and by R. A. Kennedy.²⁸

The use of editing to improve the KWIC index was
discussed by Phyllis V. Parkins,²⁹ and pre-editing was
mentioned by Robert R. Freeman and G. Malcolm Dyson
in a history of the development of Chemical Titles.³⁰
Variations from the Luhn code were noted by Freeman,
and by H. East and others.³²

In 1964 B. B. Lane published the results of a study of
title validity in technical and non-technical fields,
indicating that in non-technical fields titles reveal the contents
less frequently than in technical fields.³³ John M. Sedano
in a similar study proposed that the concept of “a
technical field” should not be restricted to science or
engineering, but should apply equally to any specialized area of
knowledge, and cited the excellent use of highly
descriptive titles in the Public Affairs and Information Service
Bulletin.³⁴ Marguerite Fischer suggested that titles in the
non-technical fields might be made KWICable by the
stylized use of part titles, a device common to literature
of the 17th and 18th centuries.³⁵

Without attempting to cover all publications in 1965,
it may be noted that a state-of-the-art report by M. E.
Stevens on automatic indexing contains valuable
references to KWIC,³⁶ particularly in the area of early,
unpublished materials and work.


• A Preliminary Look at the KWIC Index


The KWIC index and other permuted indexes are
among the group of new indexes that are called
“unconventional indexes” to differentiate them from
subject-heading indexes or classed indexes, which are called
“conventional indexes.”³⁷ The underlying principle of the
KWIC index is that words instead of concepts can be
used for indexing. Keywords (i.e., catchwords or
essential words) can be extracted from the title, abstract, or
text, and can be used effectively in the index. The context
about a keyword helps to define or explain its use, in
order to lead the index user to the exact article, paper,
or other bit of information he desires. The KWIC index
is used chiefly with titles; however, it also can be used
with abstracts or with whole texts. Furthermore, it can
be edited manually with addition or deletion of words.

Generally, each KWIC index is preceded by an intro- 
duction and a stoplist. The stoplist lists words that are 
not meaningful for indexing purposes and that are ex- 
cluded from the indexing process. Words included in 
stoplists vary from index to index and even from time 
to time in the same index as experience and circumstances 
dictate. 

In the body of the index, each line consists of three 
parts: the code, the index word, and the context. The 
parts and their arrangement vary from index to index. 
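
The mechanics just described (a stoplist, then one index line per remaining title word, each line made of code, index word, and context) can be sketched in a few lines of Python. This is only an illustrative sketch, not Luhn's program: the stoplist contents, the column widths, the reference codes, and the function name `kwic_lines` are all assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical stoplist; real stoplists vary from index to index.
STOPLIST: Set[str] = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def kwic_lines(
    entries: Iterable[Tuple[str, str]],
    stoplist: Set[str] = STOPLIST,
    left_width: int = 24,
    right_width: int = 30,
) -> List[str]:
    """Return one index line per non-stoplisted title word, sorted by index word.

    Each entry is a (code, title) pair. Each line holds the context to the
    left, the index word with the context to its right, and then the code.
    """
    rows = []
    for code, title in entries:
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,:;()").lower()
            if not key or key in stoplist:
                continue  # stoplisted words never become index words
            left = " ".join(words[:i])
            right = " ".join(words[i:])  # index word followed by its context
            # Keep only the context nearest the index word on each side,
            # so that every index word starts in the same column.
            left = left[-left_width:].rjust(left_width)
            right = right[:right_width].ljust(right_width)
            rows.append((key, code, f"{left} {right} {code}"))
    rows.sort(key=lambda row: (row[0], row[1]))
    return [line for _, _, line in rows]


if __name__ == "__main__":
    sample = [
        ("SMIT-61", "Indexing of Technical Reports"),
        ("JONE-62", "The Retrieval of Legal Reports"),
    ]
    for line in kwic_lines(sample):
        print(line)
```

Because every index word begins in the same column, a reader scanning down that column sees the alphabetical sequence of keywords, with the surrounding title words on either side serving as context.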


• A History of the KWIC Index


Seen historically, the keyword and the permuted index
were not altogether new when Luhn invented the KWIC
index, but were devices recovered for machine adaptation
from practices of old European libraries.³⁸ “Indexing by
key words, with meaning clarified by context was not
new. Scholarly concordances have been known for
centuries.”³⁹ A. Crestadoro's Art of Making Catalogues of
Libraries, published in London in 1856, more than one
hundred years before the invention of KWIC, referred to
the concept of the permutation index. Also, German
libraries were using the schlagwort (the “catchword,”
or the “keyword,” in the idiom of this paper) in their
cataloging procedures one hundred years ago or earlier.⁴⁰

Observing Luhn's background, while keeping in mind 
the historical precedents for permutation and keyword
indexes, it is interesting to speculate that Luhn's early 
acquaintance with German libraries, as a student or as 
the son of a German printer, may have led to his later 
use of the catchword or the keyword in the title as an 
indexing device for KWIC. 

In the early 1950's many people began to look at com- 
puters or machines as possible indexing tools. The Central 
Intelligence Agency as early as 1953 began to prepare a 
permuted title word index using keypunch, reproducing 
punch, sorter, and tabulator.¹⁷

Many different points may be thought of as the
beginning of the KWIC index. Black selected the point at
which


the Pontifical Faculty of Philosophy in Milan decided
that they would make an analytical index and
concordance to the Summa Theologica of St. Thomas
Aquinas, and approached IBM about the possibility of
having the operations performed on Data Processing
machinery. . . . Experience gained in this project
contributed towards the development of the KWIC
Index.¹⁶


In 1958 “KWIC was coined by H. P. Luhn ... at
about the same time Citron, Hart, and Ohlman were
developing a Permutation Index to the Preprints of the
International Conference on Scientific Information. . .”³⁷
Luhn's “KWIC method offered easy and extremely
rapid handling of large volumes of information, relatively
simple preparation of the input to the computer, and
output of a product which was readily reproduced by
photographic offset methods and easy to use.”²⁹

In the fall of 1958 the Chemical Abstracts Service
became convinced that the KWIC index designed by Luhn
could be developed as a scheme for indexing the titles of
chemical communications. A $150,000 grant by the
National Science Foundation’s Office of Science Informa- 
tion allowed the Chemical Abstracts Service to develop 
the keyword indexing scheme, and in April 1960 the 
Service distributed the first seven thousand sample copies 
of the index to registrants at the Cleveland meeting of the 
American Chemical Society. 

From 1960 to 1962 over thirty applications of the 
fundamental techniques of the KWIC concept were 
made!’ and since 1962 applications have increased even 
more rapidly. 


• Users of the KWIC Concept


It is impossible to compile a complete list of the users 
of the KWIC concept since it is impossible to determine 
exactly who is using it. The concept is so simple that 
anyone with access to a computer can use it easily and
effectively for the solution of their particular indexing 
problems. Even manual use of the concept is possible, 
although this is practical only for small indexes? Still, 
consideration of some of the principal users of the KWIC 
concept and of the manner in which they use the index 
will indicate its growing importance and popularity. 

The Bell Telephone Laboratories, in 1959, decided to 
use a permuted index. The studies started at that time 
resulted in the development of a variation of KWIC, the 
principal characteristic of which is the use of 120 char- 
acters per line instead of the 60 characters per line usually 
used with KWIC indexes. 

Biological Abstracts first published its permuted-title 
subject index in October 1961. The editors christened the 
index BASIC, standing for Biological Abstracts Subjects 
in Context. BASIC differed from the earliest KWIC
indexes by providing access to abstracts rather than to
citations alone. With time, other departures were made
from the original KWIC index, and editing, or
“vocabulary management,” evolved.²⁹

Other users of the KWIC index may be briefly noted:
the KWIC Index to the Science Abstracts of China,
issued December 1960 by the MIT Libraries, listed some
3,800 Communist Chinese papers; the KWIC Index to
Neurochemistry, prepared in 1961 by the Mimosa Frenk
Foundation for Applied Neurochemistry in cooperation
with IBM, listed some 2,100 papers; Dissertations in
Physics, an indexed bibliography of the 8,418 doctoral
theses accepted by American universities from 1861 to
1959, compiled by the IBM San Jose Research
Laboratory and published by Stanford University Press, 1961;
Keywords Index to U. S. Government Technical Reports,
a temporary publication published biweekly by the U. S.
Department of Commerce, Office of Technical Services;
Index to Legal Theses and Research Projects, published
in July 1962 by the American Bar Foundation; Current
State Legislation, a KWIC index of bills enacted by 50
state legislatures, also published by the American Bar
Foundation; Current Medical Terminology, published in
1964 by the American Medical Association; Kansas Slavic
Index, published by the University of Kansas; and
Meteorological and Geoastrophysical Titles, published by the
American Meteorological Society. Chemical Patents, a
publication of the American Chemical Society, was first
published in 1960, but failed. A commercial service to
libraries, Librarymaster Services, incorporates
keyword-in-context indexing among its other services. Companies
using KWIC indexes for internal reports or manuals
include IBM, Lockheed, the Allison Division of General
Motors, and Trans-Canada Airlines. Lawrence Radiation
Laboratory and Sandia Laboratories also use KWIC. The
Oak Ridge National Laboratory has used it
experimentally in a publication that supplements Nuclear Science
Abstracts, as Chemical Titles supplements Chemical
Abstracts.

Considering, then, the very recent origin of the KWIC 
concept, its use has spread very rapidly. 


• Improvements to KWIC: Variations and Suggestions


1. Manipulation of the index line.


There seems to be a general lack of satisfaction with the
index line. Users have manipulated it in many ways, and
numerous variations have been used. A glance at the first
KWIC index reveals excessive white space in the index. When
reading across a short line to the associated code number,
it is sometimes difficult to determine which code is
associated with which title. The difficulty is alleviated by
the use of the wrap-around, or recirculated, title. The
wrap-around title not only reduces the white space and
provides improved readability, it also brings more context,
or information, onto the index line. White space still exists
but not excessively, as seen in Fig. 1.
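
The wrap-around idea described above can be illustrated with a short sketch. It is only one possible reading of the technique, not the program behind any of the indexes named here: the function names, the stop list, the keyword column, and the " =" end-of-title mark are all assumptions made for the example.

```python
import re

# Assumed stop list; real KWIC programs used their own lists of minor words.
STOPWORDS = {"A", "AN", "AND", "FOR", "IN", "OF", "ON", "THE", "TO"}


def kwic_line(title, start, code, width=60, keyword_col=25, wrap=True):
    """Build one index line with the keyword at title[start:] placed at keyword_col."""
    left_room, right_room = keyword_col, width - keyword_col
    before, after = title[:start], title[start:]
    cut = max(0, len(before) - left_room)
    left, lost_left = before[cut:], before[:cut].strip()
    right, lost_right = after[:right_room], after[right_room:].strip()
    if wrap:
        # Spare room on the right receives the head of the title cut off on the left.
        spare_right = right_room - len(right)
        if lost_left and spare_right > 1:
            right += " " + lost_left[:spare_right - 1]
        # Spare room on the left receives the tail of the title cut off on the right.
        spare_left = left_room - len(left)
        if lost_right and spare_left > 1:
            left = lost_right[-(spare_left - 1):] + " " + left
    return left.rjust(left_room) + right.ljust(right_room) + "  " + code


def kwic_index(entries, **options):
    """entries: iterable of (code, title). Returns index lines sorted by keyword."""
    lines = []
    for code, title in entries:
        title = title.upper() + " ="  # assumed end-of-title mark
        for m in re.finditer(r"[A-Z0-9-]+", title):
            if m.group() not in STOPWORDS:
                lines.append((m.group(), kwic_line(title, m.start(), code, **options)))
    return [line for _, line in sorted(lines)]
```

With `wrap=False` the unused part of a short line stays blank; with `wrap=True` the same positions carry the part of the title that would otherwise be lost, which is the reduction in white space the text describes.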
Current State Legislation is shown in Fig. 2.


Fig. 1. Sample from Index with Recirculated Title

The wish to retain the full title has led to Bell Telephone’s
use of a 120-character index line instead of the 60-character
index line normally used in KWIC. The advantage claimed for
the long line is that only 2% of the titles listed with the
line are chopped off, whereas 30% of the titles listed with a
60-character line are chopped off. The problem of excessive
white space appears with the use of the 120-character line,
but does not seem to be as distracting as it is with the
60-character line. An example of the Bell Telephone Index is
provided in Fig. 3.
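
The comparison between the two line lengths comes down to a simple proportion. The sketch below assumes that a title counts as chopped off whenever it is longer than the index line; the function name and the sample titles are hypothetical and are not taken from either index.

```python
def chopped_share(titles, width):
    """Fraction of titles too long to appear whole on a line of `width` characters."""
    if not titles:
        return 0.0
    return sum(len(title) > width for title in titles) / len(titles)


# Hypothetical titles of 50, 70, 130, and 90 characters.
sample_titles = ["x" * 50, "x" * 70, "x" * 130, "x" * 90]
share_short_line = chopped_share(sample_titles, 60)
share_long_line = chopped_share(sample_titles, 120)
```

Run against a real collection of titles, the two shares would show how much truncation a longer line avoids for that collection.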











Fig. 2. Sample from Current State Legislation: An Index Using Darkened Title Fragments

Fig. 3. Sample from Bell Telephone Index: An Index Using 120 Characters per Line



Many users seek to increase the effective use of the index
by taking the keyword out of context, forming a
keyword-out-of-context (KWOC) index. An example of a KWOC
index is provided in Fig. 4.

A number of other indexes, although they are not called
KWOC indexes, list the keyword out of context. An IBM index
similar to the KWOC index is illustrated in Fig. 5.

Wolfe of IBM has produced innovations in programming that
provide a keyword-out-of-context index with the full title in
its natural order. The index, which is called a KWIC index,
is illustrated in Fig. 6.
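
A keyword-out-of-context listing of the kind just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used for any of the indexes mentioned: the stop list, the column width, and the function names are invented for the example, and the two records are sample data only.

```python
import re

# Assumed stop list of minor words that do not become index entries.
STOPWORDS = {"A", "AN", "AND", "FOR", "IN", "OF", "ON", "THE", "TO"}


def kwoc_entries(records):
    """records: iterable of (title, code). Returns sorted (keyword, title, code) tuples."""
    entries = []
    for title, code in records:
        words = re.findall(r"[A-Za-z][A-Za-z-]*", title.upper())
        # dict.fromkeys drops repeated words within one title and keeps their order.
        for word in dict.fromkeys(words):
            if word not in STOPWORDS:
                entries.append((word, title, code))
    return sorted(entries)


def format_kwoc(entries, key_width=14):
    """Keyword in its own left-hand column, then the full title in natural order."""
    return [f"{kw:<{key_width}}{title}  {code}" for kw, title, code in entries]


records = [
    ("The time lag of the electric spark", "7296"),
    ("Potential of systems of electric charges", "7584"),
]
listing = format_kwoc(kwoc_entries(records))
```

Because the keyword stands apart from the title, every entry shows the whole title, at the cost of no longer aligning the words that surround the keyword.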

The KWOC indexes are very popular. In addition to the KWOC
indexes illustrated above, the following are KWOC indexes:
Keyword Titles, published by the Office of Technical
Services; Scientific and Technical Aerospace Reports,
published by the National Aeronautics and Space
Administration; and International Aerospace Abstracts,
published by the American Institute of Aeronautics and
Astronautics. However, as Youden states, KWOC indexes “make
the search for multiword phrases ... much more difficult.” 49
The index illustrated in Fig. 7 is offered as an improved
version of the index illustrated in Fig. 6, differing from it
in that it leaves the keyword in context, attempts a full
citation, and combines an author index.


2. Variations of the code.


The alphanumeric code in the KWIC index line is composed of
different elements, according to the special requirements of
the index. The Luhn code, the first to be used and the most
commonly used,


is derived from factual data inherent in a document as
evinced by the publisher’s printed identification,
comprising the following elements:

1. The name of the author (or senior author) or originating
agency.

2. The year of publication.

3. The title of the document.

The code comprises eleven character positions. The first six
are derived from the name of the author or originating
agency, the next two consist of the ten’s and unit digit of
the year of publication, and the last three are derived from
the title.

The above code format was chosen over other possible
variations for the reason that when bibliographical entries
are ordered in alphabetical sequence in accordance with this
code, the utility of the resulting listing as an author index
is not seriously impaired since the variations between this
order and that demanded by the fully spelled words are
slight.
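
The eleven-position layout quoted above (six positions for the author, two for the year, three for the title) can be sketched as below. The passage does not say exactly how the author and title positions are derived, so the rules used here are assumptions made for illustration: four letters of the surname plus two initials, and the initial letters of the first three title words not on an assumed list of minor words. The function name is likewise hypothetical.

```python
import re

# Assumed list of minor words skipped when taking title initials.
MINOR_WORDS = {"A", "AN", "AND", "FOR", "IN", "OF", "ON", "THE", "TO"}


def luhn_style_code(surname, initials, year, title):
    """Return an 11-character code: 6 for the author, 2 for the year, 3 for the title."""
    letters = re.sub(r"[^A-Z]", "", surname.upper())
    # Assumed split of the six author positions: 4 surname letters + 2 initials.
    author = letters[:4].ljust(4) + initials.upper()[:2].ljust(2)
    # Ten's and unit digits of the year of publication.
    year_part = f"{year % 100:02d}"
    words = [w for w in re.findall(r"[A-Z]+", title.upper()) if w not in MINOR_WORDS]
    # Assumed rule: initial letters of the first three significant title words.
    title_part = "".join(w[0] for w in words[:3]).ljust(3)
    return author + year_part + title_part


code = luhn_style_code("Luhn", "HP", 1960,
                       "Keyword-in-context index for technical literature")
# code == "LUHNHP60KCI"
```

Sorting a list of such codes as plain strings puts entries in approximately author order, which is the property the quoted passage gives as the reason for choosing this format.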


Fig. 4. Sample from Douglas Missiles and Space Library Index: A KWOC Index

An early departure from this code was made by the
editors of Chemical Abstracts in 1962. Reader criticism of
ELECTRICAL CALCULATION OF THE RESONANT PROPERTIES OF 0545 
Ре ELECTRICAL CAVITIES. 5 
ELECTRICAL OPTICAL ABSORPTION PHOTOCONDUCTIVITYs ELECTRICAL 0554 
у CONDUCTIVITY» AMD-HALL EFFECT IN GERMANIUM 
А HONOSULFIOE. 
"ELECTRICAL "ON THE ELECTRICAL RESISTANCE OF MERCURY AT HIGH 0580 
TEMPERATURES AND HIGH PRESSURES» AND THE CRITICAL 
й POINT OF MERCURY. 
ELECTRICAL REFLECTION AND REFRACTION OF ELECTRICAL WAVES ev 0625 
SCREENS OF RESONATORS AMD BY GRIDS. d 
ELECTRICAL ELECTRICAL PROPERTIES ОҒ EVAPORATED. CARBOM FILMS. 0687 
ELECTRICAL THE ELECTRICAL PROPERTIES OF LEAD TELLURIDE FILMSe 0677 
ELECTRICAL THE INFLUENCE OF LIGHT ON THE ELECTRICAL 0719 
RESISTANCE OF METALS. , 
ELECTRICAL THE ELECTRICAL PROPERTIES OF TELLURIUM, 0720 
ELECTRICAL АҢ INVESTIGATION OF CERTAIN ELECTRICAL PROPERTIES — 0843 | 
i OF OXIDE — COATED САТНООЁ5» 
ELECTRICAL ION FORMATION AMD DECAY IN A MERCURY RESONANCE 0916 
| CELL AS EVIDENCEO BY ELECTRICAL IMAGE FORCES, 
ELECTRICAL PART 1. DIELECTRIC LOSSES AT RADIO FREQUENCIES IN 0911 
А LIGUIO DIELECTRICS. PART 11. THE ELECTRICAL 
PROPERTIES OF FLAMES CONTAINING SALT VAPORS FOR 
HIGH FREQUENCY ALTERNATING CURRENTS. PART 111. 
THE CONDUCTIVI TY oF FLAMES FOR RAPIOLY ALTERNATING 
; CURRENTS.» 
ELECTRICAL THE THERMAL AND ELECTRICAL CONDUCTIVITIES OF 0933 
CARBON ANO GRAPHITE AT LOW TEMPERATURES. 
ELECTRICAL THE ORIENTATION OF ELECTRICAL BREAKDOWN PATHS IN 1099 
SINGLE CRYSTALS. 
ELECTRICAL INVESTIGATIONS OF CERTAIN FREQUENCY DEPENDENT 1169 
ELECTRICAL PROPERTIES OF ?IOLOGICAL MATERIALS. 
ELECTRICAL THE THERMAL ANO ELECTRICAL CONDUCTIVITIES OF 1273 
2 LEAD ~ BISHUTH ALLOYS, , 
ELECTRICAL THERMAL AMD ELECTRICAL CONDUCTIVITIES OF TUNGSTEN 1456 
ANO TANTALUM, 
ELECTRICAL ELECTRICAL ANO OPTICAL PROPERTIES OF RUTILE 1498 . 
| SINGLE CRYSTALS. 
ELECTRICAL THE ELECTRICAL COMDUCTIVITY OF AQUEOUS SOLUTIONS 1513 
OF STRONG ELECTROLYTES AT HIGH FREQUENCIES. ` 
ELECTRICAL THE EFFECT OF BOUNDARIES OM THE ELECTRICAL 1402 
р PROPERTIES OF CERTAIN SEMICONDUCTORS. 
ELECTRICAL THE ORIENTATION OF ELECTRICAL BREAKDOWN PATHS IN 1626 
SINGLE CRYSTALS. : 
ELECTRICAL THE CATHODO ~ COMDUCTIVITY OF ZINCBLENDE, АМ 1793 
EXPERIMENTAL INVESTIGATION ОҒ THE EFFECT ОҒ ` 
“ELECTRON BOMBARDMENT ON THE ELECTRICAL 
- CONDUCTIVITY OF ZINCBLEMDE CRYSTALS. 
ELECTRICAL THE DECAY OF THE TRIPLET P LEVELS ІМ THE FIRST 1763 
Р ` EXCITED CONFIGURATION OF НЕОН DURING THE AFTERGLON 
А OF AH ELECTRICAL DISCHARGE. 
47 ELECTRICAL ELECTRICAL CONDUCTIVITY OF SINGLE CRYSTALS OF “1775 
* BARIUM OXIDE AS А FUNCTION QF TEMPERATURE AND 
* EXCESS BARIUM DENSITY, У 
1 ELECTRICAL THE USE OF ELECTRICAL PULSE TECHNIQUES ли THE 1873 
STUDY ОҒ THE MOBILITY OF GASEOUS 1045. 
ELECTRICAL THÉ ELECTRICAL PROPERTIES OF SEMICONDUCTORS 2019. 
THROUGH THE SOLID - LIQUID TRANSITIONS 
ELECTRICAL PULLING ELECTRONS OUT OF METALS BY INTENSE 2067 
42 ELECTRICAL FIELDS» 
"ÉCECTRICAL THE ELECTRICAL RESISTIVITY ОҒ COPPER ALLOYS AT LOW 2075 
TEMPERATURE, 
ELECTRICAL THE ELECTRICAL BEHAVIOR OF PLASTICALLY DEFORMED 2170 
(CONT INUED) 


“ 
Fig. 5. Sample from an IBM Index: A Keyword Out of Context Index
the code first used led the editors to change to an identification
code based on the journal citation rather than the author.®° An
example is the reference code “JIMT-0090-0172,” referring to the
Journal of the Institute of Metals, 90: 172.
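
The journal-citation code just described can be sketched in Python. This is only an illustration: it assumes the code is the journal abbreviation followed by the volume and the page, each zero-padded to four digits, which is inferred from the single example in the text. The function name and the padding rule are assumptions, not a documented specification.

```python
def citation_code(journal_abbrev: str, volume: int, page: int) -> str:
    """Build an identification code from a journal citation.

    Assumed layout (inferred from the one example in the text):
    ABBREV-VVVV-PPPP, with volume and page zero-padded to four digits.
    """
    return f"{journal_abbrev}-{volume:04d}-{page:04d}"


# Journal of the Institute of Metals, 90: 172
print(citation_code("JIMT", 90, 172))  # JIMT-0090-0172
```

A code built this way sorts by journal, then volume, then page, which is what would make it usable for shelving.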

The suggestion has been made that the code be constructed so that
it could be used for shelving. “If the documents indexed were
actually shelved and obtained by using Luhn's code or number, then
the index would require only a single look-up.” 40 Such indexes are
preferred by users over the double look-up index, which requires the
extra step of referring from the index to another listing for the
information necessary to find the desired material. It has been
objected that consistency in codes assigned on the basis of titles
would be difficult to attain, but this difficulty might be overcome
by a more arbitrary assignment of letters.
Special requirements of individual indexes have led to other code
variations in order to increase the efficiency of the indexes.
Experimentation with code variation will undoubtedly continue.


3. Addition of classification information 


Regardless of the success of the permuted index, voices are still
raised in favor of the classed system. For example, Guha, in a study
of the arrangement of entries in a number of indexing periodicals,
prefers a classified arrangement.* Campbell, speaking of keyword
systems in general, agrees with Guha:

For the user, the disadvantages of keyword systems are that he is
confronted with a welter of words, which may or may not include all
the words in which he puts his query, and that he needs, and rarely
gets the “bird's eye view.” 42


Campbell proposes the use of “see” and “see also” cross references
and also considers placing “the classification of keywords on to a
hinged-panel strip index, close to the keyword index.” This
combination of keyword index and classified index “has most of the
advantages of classification without many of its difficulties.” For
example, “because the classification does not now determine the
location of any document in a file, or entry in an index, the need
for mutually exclusive classes and of a single place for each
concept vanishes.” 43

Kennedy also suggests that subject scatter, one of the
characteristics, or shortcomings, of the permuted index, can be
mitigated by cross referencing.11 Examination of a KWIC index that
used cross references indicated that the use was not as successful
as it might have been because the type face of the cross references
was the same as that of the entries, making it difficult for the
user to determine which was which. However, this difficulty could
probably be remedied by improvements of the type face.


4. Combination of author index and title index: WADEX


Another innovation made in the permutation index to achieve greater
effectiveness has been the combination of





ARBITRARY
    SCHEDULING WITH ARBITRARY PROFIT FUNCTIONS                                  070910861BAF

CARD
    CRITICAL PATH SCHEDULING /CARD/                                             162010.3.005
    MISS LESS MANAGEMENT INFORMATION SCHEDULING /CARD/                          1620102432011
    1620 LESS/LEAST-COST ESTIMATING AND SCHEDULING -SCHEDULING PORTION /CARD/   162010.3.003

CLASS
    CLASS SCHEDULING PROGRAM FOR THE 7074 AND 1401                              707012.9.004

COST
    LEAST COST ESTIMATING + SCHEDULING-SCHEDULING PHASE ONLY                    065010632009

CRITICAL
    CRITICAL PATH SCHEDULING /CARD/                                             162010.3.005


Fig. 6. Sample from an Index Used at IBM: A Keyword Out of Context Index Using Full Title


subject and author entries in the WADEX index (word and author
index). Since this index treats the authors' names as keywords,
users need search only one index, not two, a convenience that has
long been available in book indexes and library card files, but
which seems to be new in the field of machine indexing. Also, users
who remember papers by the names of the authors have another means
of locating the indexed paper.??
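
The idea of filing authors' names in the same sequence as title keywords can be sketched as follows. This is a minimal illustration, not the WADEX program: the stop list, the record layout, the reference codes, and the author names are all made up for the example.

```python
from collections import defaultdict

# Hypothetical stop list; a real index would use a much longer one.
STOPWORDS = {"a", "an", "and", "by", "for", "in", "of", "the", "to", "with"}


def build_word_author_index(papers):
    """Map each title keyword and each author name to reference codes.

    `papers` is a list of (code, title, authors) tuples.
    """
    index = defaultdict(list)
    for code, title, authors in papers:
        entries = set()
        for word in title.upper().split():
            word = word.strip(".,:;()")
            if word and word.lower() not in STOPWORDS:
                entries.add(word)
        # Author names are filed in the same sequence as the keywords.
        entries.update(a.upper() for a in authors)
        for entry in entries:
            index[entry].append(code)
    return dict(sorted(index.items()))


papers = [
    ("0001", "Critical Path Scheduling", ["Smith, J."]),
    ("0002", "Scheduling with Arbitrary Profit Functions", ["Jones, A."]),
]
index = build_word_author_index(papers)
print(index["SCHEDULING"])  # ['0001', '0002']
print(index["SMITH, J."])   # ['0001']
```

Because keywords and names share one alphabetical sequence, a single look-up serves both kinds of query.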


5. Improvement of type face 


The type face used in the KWIC indexes is an important aspect of
the readability of the indexes. According to Balz, the small size of
the type, after it has been reduced for printing, “Bothers
everybody. . .” Balz reports that several things have been tried to
improve the type. Chemical Abstracts has an upper and lower case
chain in use, which may improve the readability. The chain will
increase the cost of the index slightly.**

Use of bold face type also improves the readability of the indexes
and would be of particular help in listing cross references, as
mentioned above. In a machine printout, bold face “type” may be
obtained by strikeovers.
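
A strikeover can be imitated in a character stream by printing each character, backing up, and printing it again. The sketch below uses the backspace convention (character, backspace, same character) as a stand-in for whatever carriage control a particular printer would need; the function name is hypothetical.

```python
def overstrike(text: str) -> str:
    """Return text in which every non-space character is struck twice.

    Uses the backspace convention: character, backspace, same character.
    Spaces are left alone because overstriking them has no visible effect.
    """
    return "".join(ch if ch == " " else f"{ch}\b{ch}" for ch in text)


# Only the cross-reference label is emphasized; the target term is not.
line = overstrike("SEE ALSO") + "  THERMOELECTRIC"
print(repr(line))
```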


• Improvements in the Preparation and Use of KWIC


The use of the KWIC concept and of KWIC indexes can be improved
both by those who prepare the indexes and by those who use the
prepared indexes. Those preparing the index can improve it by
obtaining better titles and, perhaps, by the use of a thesaurus.
Those using the prepared indexes can improve their use by a better
understanding of the indexes.


1. Improvement of titles


Most frequently mentioned in the literature of KWIC, perhaps, is
the need for better titles. Studies have been made of the
reliability of using titles as a basis for permutation indexing and
of various other problems in titling.

Lane, in a survey of titles contained in ten periodical indexes,
concludes that:

in science and engineering the titles of articles usually describe
or at least imply the contents of the articles. In non-technical
fields titles reveal the contents less frequently; and in a general
index such as Readers' Guide titles are indicative less than half
the time.??

Sedano's statistical analysis of six indexes establishes a similar
range from technical to general, with a similar range of title
efficiency.3* Title quality, or descriptiveness, correlates with
the specificity of its literature.


A NEW PERMUTED TITLE INDEX IN THE SOCIAL SCIENCES AND THE
    HUMANITIES BY FARLEY, EARL.                                   8467397
SELECTED WORDS IN FULL TITLE (SWIFT): A NEW PROGRAM FOR
    COMPUTER INDEXING BY NEWBAKER, H.K.                           4563298
ACCURACY OF TITLES IN DESCRIBING CONTENT OF BIOLOGICAL
    SCIENCES ARTICLES BY BERNARD, JESSIE.                         7582938
TRITSCHLER, R. J. 6693451, ELECTRONIC INDUSTRIES, APRIL, 1962,
    PP 205, 207, 210.

Fig. 7. Sample of Improved Index


Titles can serve two purposes: they can attract the attention of
the reader or they can describe the subject of the article or
publication. Also, when a secondary title is used with the primary
title, the complete title can both
attract the reader and describe the subject. An example of a
combined title, with primary and secondary titles, is:


The Shining Light:
A Blind Girl Meets Life and Finds Happiness
in San Francisco in Spite of Fire, Flood, Earthquake and
an Innocent Bystander


The phrase “The Shining Light” is the primary title and the phrase
“A Blind Girl Meets Life and Finds Happiness in San Francisco in
Spite of Fire, Flood, Earthquake and an Innocent Bystander” is the
secondary title. The primary title is a catch title; the secondary
title is a descriptive title with useful keywords.

Without too much difficulty, the convention of adding a descriptive
secondary title containing keywords to a catch title could become a
fixed stylization with the settled understanding that for
permutation, only the secondary title would be indexed. By use of
combined titles, the author would be able to supply catch titles to
attract the reader and, at the same time, to supply a descriptive
title with keywords for machine indexing.

Author participation in the writing of good titles is essential.
The editors of Biological Abstracts approach the job of instructing
authors, in a somewhat negative way, by supplying examples of very
poor titles.


How should population surveys be made? (Looks fine, only you might
like to know before you look up this paper that it deals with fish.)

The problems of changing beliefs and attitudes. (Would you guess
this to be a general philosophic discussion? If so, you will
discover instead some rather practical advice on the subject to
leaders in wildlife


ШУ een jet to the skin. (Aerospace biology? Guess 

again! Subject of the paper is control of blowflies on sheep.)**

Such titles could not be retrieved satisfactorily from a word-based
index.

Kennedy, in suggestions to authors, recommends the following:


1. Consideration of the title as a one-sentence abstract

2. Use of specific terms

3. Provision of enough context to clarify the relationships between
the selected keywords, but no more than enough

4. Balance of brevity and descriptive accuracy

5. Where possible, use of words rather than characters or notations
that cannot be duplicated on standard keypunching and computing
equipment

6. Filing of subjects in relation to titles to introduce general
concepts into the word index.?®


Herner approaches the problem of author participation from yet
another and, ultimately, more critical direction. He indicates the
degree of author participation already expected. As an example of
implicit author participation, in the specifications for papers for
the 1963 ADI meeting there was the following requirement: “The
title must be composed with care and must contain at least six
significant words.” 39 As an example of explicit author
participation, authors of papers presented at meetings of the
Federation of American Societies for Experimental Biology were
required to select indexing terms when preparing their titles. With
such author participation in the machine-indexing process, Herner
sees the dangers of standardization and conformity. These dangers,
however, already exist in the “core” vocabularies of the different
sciences and technologies. Whatever the ultimate objections to
standardization and conformity may be, “The title must permutate or
perish.” 21
The reliability of scientists in supplying titles was studied by
Papier, using the criterion of retrievability. His sampling was
small, consisting of five psychology articles, each sent to twenty
scientists working in the general field of psychology. The
scientists were asked to title the articles; then their titles were
compared with the authors' titles. Comparison of words in the actual
titles with words in titles supplied by the twenty scientists
established a word frequency, which seems to be quite high: “53% of
the scientists' words were found in the authors' titles, and 46% of
the authors' words were found in the scientists' titles.” 20 Use of
a thesaurus raised the word frequency to 61% and 62%.
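
The two percentages differ because the comparison is not symmetric: one divides the shared words by the number of words in the scientists' titles, the other by the number in the authors' titles. The sketch below shows the arithmetic on a single pair of invented titles. How Papier counted words (distinct words or every occurrence, with or without function words) is not stated in the text, so counting distinct words is an assumption.

```python
def words(title: str) -> set:
    """Lower-cased distinct words of a title, surrounding punctuation removed."""
    return {w.strip(".,:;()\"'").lower() for w in title.split()} - {""}


def overlap_percent(source: str, target: str) -> float:
    """Percentage of the distinct words in `source` that also occur in `target`."""
    src, tgt = words(source), words(target)
    return 100.0 * len(src & tgt) / len(src) if src else 0.0


# Both titles are invented for the illustration.
author_title = "Memory for Word Lists in Children"
scientist_title = "Recall of Word Lists by Young Children"

# Share of the scientist's words found in the author's title (3 of 7).
print(round(overlap_percent(scientist_title, author_title), 1))  # 42.9
# Share of the author's words found in the scientist's title (3 of 6).
print(round(overlap_percent(author_title, scientist_title), 1))  # 50.0
```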

Another point to be considered in the preparation of titles is that
it is possible for a title that seems to be good to be lost in the
system and to be indexed inadequately or even not at all. An
illustration of this is a fictional title by Brandenberg, “An
Application Oriented Explanation of Machine Models Used by the
Aerospace Companies.” Made up entirely of stopwords from a
particular computer-indexed information system, it would never
enter the index at all; it would be stopped by the machine.*1
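
The effect is easy to reproduce with a toy stop-list filter. The stop list below is assumed for the illustration: the text says only that every word of the fictional title was a stopword in one particular system, not what else that list contained.

```python
def keywords(title: str, stoplist: set) -> list:
    """Return the words of a title that are not on the stop list."""
    cleaned = (w.strip(".,:;()").lower() for w in title.split())
    return [w for w in cleaned if w and w not in stoplist]


# Assumed stop list, built from the words of the fictional title.
STOPLIST = {
    "an", "application", "oriented", "explanation", "of", "machine",
    "models", "used", "by", "the", "aerospace", "companies",
}

title = ("An Application Oriented Explanation of Machine Models "
         "Used by the Aerospace Companies.")
print(keywords(title, STOPLIST))                       # []
print(keywords("Critical Path Scheduling", STOPLIST))  # ['critical', 'path', 'scheduling']
```

A title that yields an empty keyword list generates no index entries, so the paper cannot be found through the index at all.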


2. Use of a thesaurus 


Balz defines a thesaurus, as used in the field of information
retrieval, as “a collection of authorized subject headings or
‘descriptors.’ These descriptors are arranged according to
conceptual groups and fields, accompanied by an alphabetic index
and containing pertinent scope notes and cross references.” He
explains further, “The idea of an official ‘authority list’ of
subject headings or descriptors, however, is far from a new concept
since ‘authority lists’ have been used by catalogers in libraries
for many years.” He goes on to define thesaurus in general library
terms,


The thesaurus, then, is the tool that catalogers or subject
analysts use in describing the contents of documents in order that
each may describe similar content consistently. It is also the
device by which search requests may be worded to assure a
relatively high degree of accuracy on retrieval.


An example of a machine-made thesaurus may be seen in Fig. 8.


ACARICIDES
    (PEST CONTROL AND INHIBITING AGENTS)
    INCL:     MITICIDES
    ALSO SEE: ANTIPEST IMPREGNANTS
              PARATHION
              PEST CONTROL

ACCELERATION
    (MECHANICS)
    ALSO SEE: DECELERATION

ACCELERATION INTEGRATORS USE
    ACCELEROMETERS

ACCELERATION TOLERANCE
    (TOLERANCES)

ACCELERATORS
    (PARTICLE ACCELERATORS)
    ALSO SEE: BETATRONS
              CYCLOTRONS
              ELECTRON ACCELERATORS
              ELECTROSTATIC ACCELERATORS
              ION ACCELERATORS
              LINEAR ACCELERATORS
              PARTICLE ACCELERATORS
              PROTON ACCELERATORS
              SYNCHROTRONS

Fig. 8. Example of Machine-Made Thesaurus


Papier is not convinced that supplying thesauri is of great value.
In his study of indexing relevance, he says, “It can be said that
providing thesauri in the sample studied could only increase the
first try probability from 46% to a maximum of 62%.” 20

Storrer, like Papier, has a reserved attitude toward the need and
value of a thesaurus with the KWIC index.


The KWIC listing when used as an index minimizes the need of a
thesaurus in seeking references on particular topics. Of course, a
thesaurus is necessary to avoid problems arising from the use of
synonyms, variance in spellings, and the ambiguities of identical
words with altogether different meanings.18


Balz admits “that the use of a thesaurus in information retrieval
work has not been universally agreed upon.” Nonetheless, he “points
out the need for a controlled list of descriptors and the pitfalls
of operating an automated information retrieval system using
uncontrolled natural language.”

He states that,


A system that does not use a thesaurus is based on the premise that
words have precise meanings and that they do not derive any
significance from context. However, some words do have more than
one meaning, and thus there exist problems of interrelationships.*®


The type of thesaurus proposed by Balz resembles a 
classed system. Specific ideas are grouped under broader 
levels so that narrow, related, subordinate concepts can 
be retrieved through broad areas. 


The purpose of a thesaurus is to enable retrievers of information
to describe their information needs in terms used by the
originators and indexers, which is a basic requirement in the
effective documentation of information.**
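
How a thesaurus of this kind steers a searcher toward the indexer's terms can be sketched with a few entries transcribed from the Fig. 8 sample. The dictionary layout and function names are assumptions. So is the reading of INCL as "terms filed under this descriptor", and the ACCELEROMETERS entry is added only because the sample names it as a USE target.

```python
# Entries transcribed from the Fig. 8 sample; the layout is an assumption.
THESAURUS = {
    "ACARICIDES": {
        "incl": ["MITICIDES"],
        "also_see": ["ANTIPEST IMPREGNANTS", "PARATHION", "PEST CONTROL"],
    },
    "ACCELERATION": {"also_see": ["DECELERATION"]},
    "ACCELERATION INTEGRATORS": {"use": "ACCELEROMETERS"},
    "ACCELEROMETERS": {},  # assumed entry: target of the USE reference
}


def authorized_term(term):
    """Follow a USE reference, or an INCL listing, to the authorized descriptor."""
    term = term.upper()
    entry = THESAURUS.get(term)
    if entry is not None:
        return entry.get("use", term)
    for descriptor, data in THESAURUS.items():
        if term in data.get("incl", []):
            return descriptor
    return None  # not in the controlled vocabulary


def search_terms(term):
    """Authorized descriptor followed by its ALSO SEE references."""
    descriptor = authorized_term(term)
    if descriptor is None:
        return []
    return [descriptor] + THESAURUS[descriptor].get("also_see", [])


print(search_terms("acceleration integrators"))  # ['ACCELEROMETERS']
print(search_terms("miticides"))
```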


3. Improvement of the use of KWIC 


For effective use of KWIC indexes and similar indexes, 
the user should be aware of the special characteristics of 
these indexes and should have an appreciation of the 
essential difference between these indexes and the subject- 
heading indexes. 

The user should realize that he will need to look for 
synonyms of the words that most precisely describe the 
subject for which he is searching. For example, when 
searching for literature on the KWIC index, reference 
must be made to the term “keyword-in-context” as well 
as to "KWIC." 

The user should realize that he may need to look from 
the specific to the general when using KWIC indexes 
rather than from the general to the specific, as is the case 
when using a subject-heading index. For example, litera- 
ture on KWIC is indexed under the general terms “per- 
mutation index,” “permuted indexing,” and “indexing.” 

Also, the user must be aware of terms used in relationship to the
subject for which he is searching, even though those terms are not
in hierarchical relationship to each other. The ease with which a
narrow subject may be located in the KWIC indexes should not
distract the user from other possible leads to relevant bits of
information. For example, reference should be made to the related
term “titles” in a search for literature on the KWIC index.
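
The three pieces of advice above (look under synonyms, under more general terms, and under related terms) amount to widening the list of words looked up. The sketch below uses a toy keyword index with made-up reference codes; the terms are the ones given as examples in the text.

```python
def find_entries(index, terms):
    """Collect the reference codes filed under any of the given terms."""
    found = set()
    for term in terms:
        found.update(index.get(term.upper(), []))
    return sorted(found)


# Toy index: keyword -> reference codes (all codes are made up).
index = {
    "KWIC": ["0101"],
    "KEYWORD-IN-CONTEXT": ["0102"],
    "PERMUTED": ["0103"],
    "INDEXING": ["0103", "0104"],
    "TITLES": ["0105"],
}

precise = ["KWIC"]
synonyms = ["keyword-in-context"]
general = ["permuted", "indexing"]
related = ["titles"]

print(find_entries(index, precise))  # ['0101']
print(find_entries(index, precise + synonyms + general + related))
# ['0101', '0102', '0103', '0104', '0105']
```

Looking up only the most precise word finds one of the five entries; the wider list finds them all.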

Librarians, scientists, technologists, and other professional
people using KWIC indexes should all be aware of the special
characteristics of the indexes. It is particularly important for
the librarian using KWIC indexes to have a greater awareness of
special vocabularies than he needs when using a subject heading
index.


• Usage of the KWIC Index


Perhaps no aspect of the KWIC index is as controversial as its
usage. The KWIC index was created to “cope with the problems of
timeliness.” 3? It was conceived of as a current awareness tool and
it is primarily used as such. It has been spoken of as “the best
replacement for the old library bulletin, for getting things out to
people quickly.” 4 Luhn spoke of the KWIC system and its usage as
follows:

(1) The principal merit of the method is timeliness. The KWIC
system lends itself to index production in the shortest possible
time with a minimum of effort.

(2) The proper objective of KWIC indexes is to increase among their
readers an awareness of current research.

(3) The usefulness of these indexes is of a temporary nature.
Ideally, they should be superseded . . . by “an instrument prepared
with care in due course, incorporating all those features which
will enhance its usefulness as a permanent tool of reference.” 47


The semimonthly index, Chemical Titles, represents a faithful
adaptation of Luhn's proposal; it is a disseminating index rather
than a retrieval index. Aimed at increasing current awareness, each
semimonthly Chemical Titles is a concordance to articles selected
from some 600 periodicals. It is not an index to Chemical
Abstracts. Papers are likely to be listed in Chemical Titles even
before their abstracts are published. It is apparent that Chemical
Titles is not intended to take the place of thorough
indexing-in-depth.37


Librarymaster Services uses KWIC for new items announcement
bulletins because it is “the most economical listing . . . it is in
effect a hybrid subject and title catalog.” 48 A similar permuted
title system used at Bell Telephone Laboratories provides “a
current awareness bulletin with a highly useful subject
structure.” 39

Luhn, a few months after he proposed the KWIC index, also proposed
an accompanying service system which he named Selective
Dissemination of Information (SDI). He explained the need for such
a system in this manner:


Effective dissemination of scientific information has of late
become the subject of major interest and concern because of the
realization that it is apt to play a decisive part in the race for
leadership in technological accomplishments, be it among nations or
be it among organizations and businesses within a nation. It is
felt that if discoveries and new developments can promptly and
exhaustively be brought to the attention of scientists and
engineers at large, technological progress may be accelerated. This
feeling is obviously born from a conclusion that presently existing
means of scientific communication are inadequate, a condition which
has invited the attention of government and institutional leaders
and bodies.*°


Luhn's method of selective dissemination of information consists of
a machine comparison of a pattern of keywords characterizing a new
document with an interest profile of a user. If there is enough
similarity between the two, the user is notified by card. If the
user wishes to have the document, he returns a stub from the card
and the document is sent to him. SDI is the subject of research and
experimentation by Chemical Abstracts Service 5: and is in routine
use at IBM.
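
The comparison step can be sketched as follows. The text says only that the user is notified when there is "enough similarity" between the document's keywords and the profile. The similarity measure used here (the share of profile keywords found in the document), the threshold of 0.5, and the sample profiles are all assumptions chosen for the illustration.

```python
def similarity(document_keywords, profile_keywords):
    """Fraction of the profile's keywords that occur in the document."""
    doc, profile = set(document_keywords), set(profile_keywords)
    return len(doc & profile) / len(profile) if profile else 0.0


def users_to_notify(document_keywords, profiles, threshold=0.5):
    """Users whose interest profile matches the document closely enough."""
    return [
        user
        for user, profile in profiles.items()
        if similarity(document_keywords, profile) >= threshold
    ]


# Hypothetical interest profiles and a hypothetical new document.
profiles = {
    "user_a": ["superconducting", "alloys", "niobium"],
    "user_b": ["indexing", "thesaurus"],
}
document = ["high-field", "superconducting", "alloys", "zirconium"]

print(users_to_notify(document, profiles))  # ['user_a']
```

Raising the threshold sends fewer notices but risks missing documents of marginal interest; lowering it does the opposite.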
SDI can be considered to be an extension of the same principle that
underlies the KWIC index; that is, the rapid dissemination of
current information from source to interested user.

KWIC indexes are used as retrospective searching tools as well as
current awareness tools. As stated by Balz and Stanwood of IBM,
“KWIC indexing allows rapid preparation of current awareness
bulletins and at the same time provides a retrospective
multi-aspect search facility.” 51 A similar opinion is offered by
Kennedy in a description of Bell Telephone Laboratories' use of the
permuted index:

Again by tape merge or punched deck re-runs, the production of
cumulated and up-dated index volumes on a continuous or periodic
basis, say semiannually, is simple and cheap. The original
investment in the announcement bulletin is thereby exploited to
provide a multi-aspect retrospective search facility.*®
The Biological Abstracts adaptation of the KWIC index, which is
called BASIC (acronym for Biological Abstracts Subjects in
Context), is designed “to meet not only a need for ‘current
awareness,’ but also the need for a permanent reference tool.” 47
Enthusiasm for its use by scientists may be represented by the
statement of one who wrote, “This is as great an advance to biology
as the electron microscope.” 47

WADEX (acronym for Word and Authors Index), which is an index to
Applied Mechanics Reviews, “is intended for retrospective search
rather than for current awareness information, though it could also
be used for the latter purpose.” 23
It is interesting and significant to note that, as the use of the
KWIC index deviates farther and farther from a simple current
awareness bulletin, it becomes more necessary to change or adapt it
to the enlarged usage of retrospective search. Almost always, the
first change to be made is in the practice of editing, to make KWIC
more acceptable as a permanent book index. Balz and Stanwood make
this statement about KWIC:


This system produces indexes more rapidly, accurately, and with
greater definition of subject content than other forms of indexing;
especially when provision is made to add descriptive words to
titles that do not in themselves convey the subject content of the
papers they represent.51

Kennedy, of Bell Telephone Laboratories, writes:

There are . . . no rules in the game which say that human
contributions to mechanized indexing are illegal, that data to be
keypunched for indexing must come from the title alone. Several
steps for adding to the coverage or convenience of a permuted index
can be taken without compromising the essential merits of
mechanization. At the editorial scan stage titles which appear on
the basis of this quick inspection to be weak or unclear might be
supplemented, say by marking a word or words in the document
abstract for keypunching . . .13


Biological Abstracts uses “vocabulary management” and
“supplementation” of titles in its index, which is used for current
awareness and for cumulation.3? WADEX, which departs from the
classic KWIC format in a number of ways, also uses editing in the
manner described as follows:


As the first step, the titles as they appear in the magazine are
edited by an engineer or scientist to remove or change all those
features which cannot be properly handled by the keypunch and
printout equipment. . . . These titles are then keypunched. After
verification, the cards are fed into the IBM 1401 (Program A) which
prints out their contents for another human post-editing. After
corrections from this editing are inserted, machine processing
begins.?3


On the use of editing, Herner takes the viewpoint that:


In all probability, unless the existing technology changes
drastically, any permuted index that resembles a conventional book
index will be the product of either human operators alone or of
machines working in tandem with human operators, whose job it would
be to make sure that the machines make sense.5?


• The Future of KWIC Indexes and of the KWIC Concept


The KWIC index may some day find a place in a national index that
might be part of a national information center patterned after the
center in Russia, or it may find a place in regional information
centers or in discipline-oriented centers. According to Veyette,
such centers should concern themselves with current literature, and
should have


an active program of rapid, automatic, selected dissemination. A
system to accomplish such a program could be the Selective
Dissemination of Information system. . . . Such a system would
notify scientists, engineers, management personnel, educators,
governmental employees, and the private citizen, if desired, of
reports, articles, and other papers of interest to him.5


KWIC indexes, with their emphasis on currency and by their link
with SDI, will fit easily into such an information system.

The future for KWIC includes the plans to replace 
card-punch operation with optical scanners, making the 
preparation of the index even faster than it is today. 
Also, plans for using the “Echo” satellites to link informa- 
tion centers around the world, in a worldwide drive to- 
ward immediacy in information dispersion, will surely 
provide a place for KWIC indexes and for the KWIC 
concept. 
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Pa. (Oct. 14-16.) 
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Aslib Proc. 15: 333-335. (Nov.) 
1965. Automatic Indexing: A State-of-the-Art Report, NBS Monograph
91, GPO. U. S. Department of Commerce, National Bureau of
Standards, Washington, D. C. (Mar. 30.)
Characteristics and Use of Personal Indexes Maintained by
Scientists and Engineers in One University 1


Interviews with 75 graduate school faculty members in the science
and engineering departments at Florida State University have
revealed that 46 of the interviewed faculty members maintain
personal indexes.


Personal indexes are organized collections of documents and/or
homemade references to documents that the researcher keeps in his
office. Information gathering habit studies have shown that a
significant portion of researchers maintain personal indexes.
Studies by Fishenden, Tornudd, and Hogg and Smith, for example,
have brought out the fact that 45% (1), 57% (2) and 66% (3),
respectively, of surveyed scientists had and/or used personal
indexes. Zwemer has found that nearly every scientist surveyed in a
recent study kept a personal file in the way of reprints,
abstracts, or notes on cards, and that the average rate of growth
of 26 such collections is 330 items per year (4). In another recent
study of the information needs of Department of Defense scientists
and engineers, 17% of the interviewed scientists and engineers used
personal files as their first source of information, while 51% of
the interviewed scientists and engineers relied on their local
environment (personal files, departmental files, and colleagues) as
a first source of information (5). Heller and Wallace suggested
that information specialists be used to assist in the preparation
of personal indexes. They describe the preparation of personal
indexes, printed with the aid of computers, for professional and
administrative personnel in one organization, the Systems
Development Corporation (6, 7). A study now being done at Florida
State University carries this suggestion one step further. Personal
indexes will be prepared for a group of researchers in science and
engineering and the use of these indexes

1 Некеатоћ sponsored іп part by the Air Force Office of Selentific 
Research, Office of Aerospace Research, United States Air Force, under 


. AFOSR Grant Number 895-65. 


The structure of these personal indexes, their size, rate 


. of growth, frequency of use, physical form, and other. 
characteristics are given and discussed. 


G. JAHODA, RONALD" D. HUTCHINS, апа ROBERT 
R. GALFORD 


Library School | 
Florida State University 
Tallahassee, Florida 


will be studied. The study is being carried out in several 
stages. The first stage, the one that has been completed 
and is described in this report, consisted of a survey of 
personal indexes now in use. 

Information about the personal indexes maintained and
used by Florida State University graduate faculty members
in science and engineering was obtained by means of
personal interviews. Faculty members to be interviewed
were selected from the 1964-65 Graduate Bulletin of
Florida State University. The Chemistry, Food and Nutrition,
Biology, Geology, Mathematics, Physics, Meteorology,
and Statistics Departments, and the School of
Engineering were included in this study. Only faculty
members in the Graduate School were selected since it
was believed that these members were most active in
research. Heads of departments (with one exception)
were excluded because their activities were considered to
be predominantly administrative rather than research
oriented. The original sample of 105 researchers that was
selected in September 1964 was increased by seven (new
appointments to the faculty) and reduced by 37 (resignations,
leaves of absence, or unwillingness to participate in
the study). Thus 75 researchers have been interviewed,
with 46 researchers having a personal index and 29 researchers
not having a personal index. The interviews
were conducted from September 1964 to July 1965 and
lasted an average of 56 minutes with researchers who
maintain a personal index. No record of time was kept for
interviews of researchers who did not have a personal
index. The results of the interviews are given in Tables
1-16.

Table 1. Distribution by Departments.

                        Biology     Chemistry   Geology     Math        Physics     Other       Total
With Personal Index     16 = 84%    10 = 67%    5 = 71%     4 = 33%     4 = 50%     7 = 50%     46 = 66%
Without Personal Index  3 = 16%     5 = 33%     2 = 29%     8 = 67%     4 = 50%     7 = 50%     29 = 34%








Table 2. Distribution by Academic Rank.

                                    Associate   Assistant
                        Professor   Professor   Professor   Total
With Personal Index     20 = 71%    19 = 63%    7 = 41%     46 = 66%
Without Personal Index  8 = 29%     11 = 37%    10 = 59%    29 = 34%





Table 3. Number of Documents in Personal Index.

                  Biology     Chemistry   Geology     Math        Physics     Other       Total
Less than 1,000   6 = 38%     2 = 25%     -           3 = 100%    1 = 33%     4 = 67%     16 = 40%
1,000 - 2,000     1 = 6%      5 = 63%     1 = 25%     -           1 = 33%     -           8 = 20%
2,000 - 3,000     4 = 25%     1 = 13%     1 = 25%     -           1 = 33%     -           7 = 18%
3,000 - 4,000     1 = 6%      -           1 = 25%     -           -           -           2 = 5%
4,000 - 5,000     1 = 6%      -           -           -           -           2 = 33%     3 = 8%
5,000 - 10,000    3 = 19%     -           1 = 25%     -           -           -           4 = 10%

Number of respondents = 40.











Table 4. Rate of Growth Per Month.*

Biology  Chemistry  Geology  Math  Physics  Other  Total

1 - 5      2 = 14%  2 = 67%  4 = 12%
6 - 15     8 = 57%  3 = 50%  1 = 33%  1 = 20%  13 = 38%
16 - 30    3 = 22%  2 = 33%  2 = 50%  2 = 100%  2 = 40%  11 = 32%
31 - 50    1 = 17%  1 = 20%  2 = 6%
51 - 75    1 = 20%  1 = 3%
76 - 100   1 = 7%  2 = 50%  3 = 9%

Number of respondents = 34.

* In number of documents.


Table 5. Age of Index in Number of Years.

                  Biology     Chemistry   Geology     Math        Physics     Other       Total
Less than 1       1 = 6%      -           -           -           -           -           1 = 2%
1 - 5             1 = 6%      2 = 22%     2 = 40%     2 = 50%     2 = 50%     3 = 43%     12 = 27%
6 - 10            2 = 13%     3 = 33%     1 = 20%     1 = 25%     2 = 50%     3 = 43%     12 = 27%
11 - 15           5 = 31%     3 = 33%     1 = 20%     1 = 25%     -           -           10 = 22%
16 - 20           3 = 19%     -           1 = 20%     -           -           1 = 14%     5 = 11%
More than 20      4 = 25%     1 = 11%     -           -           -           -           5 = 11%

Number of respondents = 45.








Table 6. Frequency of Updating.

Biology  Chemistry  Geology  Math  Physics  Other  Total

Daily                1 = 7%  1 = 10%  2 = 50%  1 = 33%  5 = 13%
Weekly               1 = 7%  1 = 25%  1 = 33%  3 = 8%
Monthly              4 = 27%  1 = 10%  1 = 25%  6 = 16%
As collected         9 = 60%  7 = 70%  3 = 100%  2 = 67%  2 = 67%  23 = 60%
2-3 times per year   1 = 10%  1 = 3%

Number of respondents = 38.

Table 7. Types of Documents Included.

Biology  Chemistry  Geology  Math  Physics  Other  Total

Journal articles           16 = 100%  10 = 100%  5 = 100%  4 = 100%  4 = 100%  7 = 100%  46 = 100%
Conference papers          11 = 69%  5 = 50%  4 = 80%  2 = 50%  3 = 75%  3 = 43%  28 = 61%
Government reports         5 = 31%  5 = 50%  5 = 100%  1 = 25%  2 = 50%  5 = 71%  23 = 50%
Patents                    3 = 30%  1 = 14%  4 = 9%
Technical correspondence   1 = 6%  3 = 30%  1 = 20%  1 = 25%  6 = 13%
Lecture notes              1 = 20%  1 = 25%  2 = 5%
Trade literature           1 = 20%  1 = 25%  2 = 5%
Books in their index       2 = 13%  1 = 10%  1 = 25%  1 = 14%  5 = 11%
Theses                     1 = 6%  1 = 2%
Seminar notes              1 = 10%  1 = 2%

Number of respondents = 46.

Table 8. Physical Arrangement of Original Documents.

Biology  Chemistry  Geology  Math  Physics  Other  Total

Subject*           8 = 50%  9 = 90%  4 = 80%  3 = 75%  1 = 25%  1 = 16%  26 = 58%
Type of document   9 = 56%  5 = 50%  2 = 40%  1 = 25%  1 = 25%  3 = 50%  21 = 47%
Author             8 = 50%  3 = 30%  1 = 25%  1 = 25%  2 = 33%  15 = 33%
Date               1 = 6%  1 = 20%  2 = 5%
Accession number   5 = 31%  1 = 25%  6 = 13%

Number of respondents = 46.

* Includes several which are arranged by subject with subarrangement by author.

Table 9. Access Points.

Biology  Chemistry  Geology  Math  Physics  Other  Total

Subject   14 = 88%  10 = 100%  4 = 80%  3 = 75%  2 = 50%  4 = 57%  37 = 80%
Author    11 = 69%  4 = 40%  2 = 40%  3 = 75%  4 = 100%  4 = 57%  28 = 61%
Title     2 = 13%  2 = 5%
Project   4 = 25%  1 = 10%  1 = 20%  1 = 25%  1 = 14%  8 = 17%

Number of respondents = 46.

Table 10. Average Number of Access Points.

          Biology     Chemistry   Geology     Math        Physics     Other       Total
1         8 = 50%     6 = 60%     5 = 100%    4 = 100%    2 = 50%     4 = 57%     29 = 63%
2         3 = 19%     2 = 20%     -           -           1 = 25%     1 = 14%     7 = 15%
3         2 = 13%     -           -           -           -           1 = 14%     3 = 7%
4         1 = 6%      -           -           -           1 = 25%     -           2 = 5%
5         2 = 13%     2 = 20%     -           -           -           -           4 = 9%
6 - 8     -           -           -           -           -           1 = 14%     1 = 2%

Number of respondents = 46.

Table 11. Types of Subject Indexes.

Biology  Chemistry  Geology  Math  Physics  Other  Total

Alphabetical subject          4 = 31%  1 = 13%  2 = 100%  2 = 100%  1 = 33%  10 = 24%
Coordinate                    3 = 23%  2 = 25%  2 = 67%  7 = 17%
Broad subject or classified   10 = 76%  7 = 88%  2 = 100%  3 = 100%  1 = 50%  1 = 33%  24 = 59%

Number of respondents = 31.


Table 12. Physical Form of Index.

                  Biology     Chemistry   Geology     Math        Physics     Other       Total
On cards          11 = 70%    6 = 67%     4 = 80%     1 = 25%     3 = 75%     4 = 100%    29 = 70%
File folders      7 = 44%     5 = 56%     2 = 40%     3 = 75%     2 = 50%     2 = 50%     21 = 49%
Page form         1 = 6%      -           -           -           -           -           1 = 2%
Pamphlets, boxes  2 = 13%     -           -           -           -           -           2 = 5%

Number of respondents = 42.








Table 13. Amount of Bibliographic Information Included in Index Entries on Cards.

                            Biology     Chemistry   Geology     Math        Physics     Other       Total
Citation or accession No.   8 = 67%     3 = 50%     2 = 50%     1 = 100%    1 = 33%     1 = 25%     16 = 53%
Citation and keywords       1 = 8%      -           -           -           2 = 67%     -           3 = 10%
Citation and abstract       3 = 25%     3 = 50%     2 = 50%     -           -           3 = 75%     11 = 37%

Number of respondents = 30.


Table 14. Frequency of Use.

Biology  Chemistry  Geology  Math  Physics  Other  Total

Daily           8 = 50%  6 = 67%  2 = 50%  2 = 50%  1 = 25%  2 = 33%  21 = 49%
Twice weekly    5 = 31%  3 = 33%  1 = 17%  9 = 21%
Weekly          1 = 25%  2 = 50%  2 = 50%  2 = 33%  7 = 16%
Twice monthly   1 = 25%  1 = 2%
Monthly         1 = 17%  1 = 2%
Sporadic        3 = 19%  1 = 25%  4 = 9%

Number of respondents = 43.


Table 15. Age Group of Most Active Material in Number of Years.

                  Biology     Chemistry   Geology     Math        Physics     Other       Total
Up to 2           3 = 19%     1 = 10%     -           1 = 25%     1 = 25%     -           6 = 13%
Up to 5           6 = 38%     6 = 60%     2 = 40%     -           -           4 = 57%     18 = 39%
Up to 10          2 = 13%     3 = 30%     -           1 = 25%     2 = 50%     2 = 29%     10 = 22%
Older than 10     2 = 13%     -           3 = 60%     1 = 25%     1 = 25%     1 = 14%     8 = 17%
All equal         3 = 19%     -           -           1 = 25%     -           -           4 = 9%

Number of respondents = 46.


Table 16. Shortcomings and Desired Improvements Suggested by Researchers.

Biology  Chemistry  Geology  Math  Physics  Other  Total

Too time consuming to prepare       5 = 56%  1 = 33%  1 = 50%  1 = 50%  8 = 42%
Inconsistencies in indexing         2 = 22%  2 = 67%  2 = 67%  6 = 32%
Not enough access points            2 = 22%  1 = 33%  3 = 16%
Lacks subject approach              1 = 11%  1 = 50%  1 = 50%  3 = 16%
Collection inadequate               1 = 33%  1 = 5%
Not up to date                      1 = 11%  1 = 33%  2 = 11%
Subject headings too detailed       1 = 11%  1 = 5%
Alphabetical arrangement unwieldy   1 = 11%  1 = 5%

Number of respondents = 19.


The first two tables give the distribution of researchers
with and without a personal index by department and 
rank. The subsequent tables characterize the personal in- 
dexes with the tables being arranged in the order of the 
questions in the interview schedule. For each table the 
number and the percentage of responses are given by 
individual departments with the exception of the De- 
partments of Food and Nutrition, Meteorology, and the 
School of Engineering. The small number of interviewed 
researchers who had personal indexes in these areas led 
us to group their answers under “other.” Several questions
yielded more than one answer per researcher, e.g., types of 
documents included, and it is for this reason that the 
percentage figures for these questions add up to more 
than 100. 

The surveyed personal indexes do not constitute a sta- 
tistically representative sample because researchers in 
only one university have been studied and because the 
sample is too heterogeneous in terms of subject interests. 
Nevertheless, a number of observations will be made about
the personal indexes, observations that might be tested
on a larger and more homogeneous sample. Most collections
are relatively small in size and are not now growing
at a very rapid rate. Of the personal indexes kept by the 66%
of interviewed researchers who had one, 60% contained 2,000
or fewer documents and 78% contained 3,000 or fewer documents.
Eighty-two per cent of the indexes grew at the rate
of 30 or fewer documents per month. This raises the
question of whether there are relatively few basic documents
of interest to the researcher or whether the library
is able to fill the researcher’s need for basic documents, to
consider only two possibilities. The researchers’ personal
indexes are frequently updated, and this appears to be an
indication of the importance of this tool to the researcher.
Eighty-one per cent of the indexes are updated at least
weekly (if we assume that updating “as collected” means
at least weekly).

While all indexes in the sample included journal articles
(including reprints and preprints), other forms of publications
were not included in a number of files. This was not
surprising in the case of patents since the researchers'
interests were concentrated on the basic rather than
applied sciences. The small number of indexes that included
technical correspondence (13%) appears to be an
indication that scientific information of more than ephemeral
value is not recorded in this form. Only 5% of the
indexes included trade literature (mostly product, equipment,
or chemical catalogs), but a subsequent check of
the offices of 11 researchers showed that 10 out of 11
researchers kept trade literature in their offices (though
not in an organized form). Only 2% of the researchers
indicated during the interviews that they include theses
or dissertations in their personal indexes. However, subsequent
visits to 11 offices indicated that seven out of 11
researchers had theses and dissertations on their shelves.

The response on the age of the most active material is
also worth noting. Thirty-nine per cent of the researchers
considered material up to five years old most active. Material
up to two years old was considered most active by
13% of the researchers. Of the 80% of the researchers
who had a subject approach to their indexes, 59% used a
broad subject (classified) arrangement, 24% an alphabetic
subject index, and 17% a coordinate index.

Shortcomings in or problems with the personal index
were listed by 19 researchers. Forty-two per cent who
responded to this question considered the time devoted to
prepare their index as excessive, 32% complained of inconsistencies
in indexing (a problem of professional indexers
as well), 16% desired more access points, 16%
missed a subject approach (this represents three out of 19
researchers who answered this question and did not have a
subject approach), and only 5% considered their collection
inadequate. Almost half (49%) of the researchers
used their indexes daily, another 39% used their personal
indexes at least weekly, according to the interviews.

Researchers who indicated daily use of their personal 
indexes were asked to participate in the next stages of the 
study. This consists of collecting case histories of personal 
index use, analyzing these individual case histories to 
determine what type or types of indexes appear to be 
most suitable, designing personal indexes based on this 
analysis, and studying the use of these indexes. The collec- 
tion and analysis of case histories of use of seven personal 
indexes are now underway.


State of the Art of Computers in Commercial Publishing


Despite excessive glamorizing of the role of computers
in publishing, truly economical applications are beginning
to emerge. Many types of directories, indexes,
cumulated bibliographies, and cumulated library book
catalogs can today be put through a computer in less
time and at less cost than for conventional typesetting.
Examples of successful applications are given, after
taking up one by one the many system design decisions
that must be made in the areas of input hardware,
computer hardware, computer software, and
output hardware. Major attention is given to the
proofreading problem, because line printers are still
far from ideal for producing the equivalent of galley
proofs.


JOHN MARKUS*

McGraw-Hill, Inc.
New York, New York

* Information Research, McGraw-Hill, Inc., New York.


● Introduction


This paper starts with a dream. In this dream there
is a huge room, filled with hundreds of blondes, brunettes,
and redheads. They are all retyping manuscripts in
rhythm, to the tune of the Stars and Stripes Forever, on
tape-punching typewriter keyboards. Nobody worries
about justification, punctuation, widows, rivers, or the
other little details of typography that plague a Linotype
operator. The typewriter hard copy is proofread, correction
tapes are punched, and all the tapes are fed into a
computer at 500 or 1,000 characters per second.

The computer justifies and hyphenates the copy as per
typographical specifications. It also inserts repetitive
words, supplies headings, makes up pages, eliminates bad
breaks, calculates total length, and even spreads out or
squeezes together the lines to eliminate blank pages in the
last printing form.

The final output tape of the computer is error-free,
because our computer does not make mistakes. The output
display machine, equally error-free, converts this tape
into negatives or positives, ready for offset or letterpress
plate-making. Since this dream system makes no mistakes,
there is no need for galley proofs or page proofs. This
means no more printers' alteration charges, because authors
will no longer get a chance to rewrite their manuscripts
on galleys or page proofs.

Sure, this sounds like a dream. But without too great
an investment, we could make this
dream come true today for practically any printing job,
if we wanted to. Now I must sound a warning: Pushing
technology too fast could change our dream into a nightmare.
As many of you have noticed, quite a few parts of
this dream are fuzzy. For instance, I didn't say whether
these girls were in the publisher's office, a printing plant,
a central service bureau, or a huge and drafty old castle
in England. I didn't say where the computer was. Likewise,
I didn't say who had the output machine, or what
it was. All I did say is that we ended up with negatives
or positives that were ready for conventional plate-making
and printing.

Now let's see why we have to be so vague, why we
need answers to so many questions before the ultimate
role of computers in publishing can be pinpointed.


● System Design


The chief problem in computerized publishing is that
there are so many different types of equipment and so
many different techniques, each with advantages and
drawbacks, for each part of our system. From these many
variables, we must find the optimum system combination
that will meet specific present and future publishing
needs, reliably and economically. The glamour and publicity
of new computer hardware must not lure us into
premature action. This system design problem divides
into four parts:

1. Input hardware, which converts the copy
into machine-readable form.

2. Computer hardware, which does the creative part
of the printing process.

3. Computer software, including programs that provide
for the production of proofs, correction of
errors, and updating of subsequent editions.

4. Output hardware, consisting usually of a photocomposition
machine that converts the computer
output tapes to negatives or repro copy for
printing.


● Input Hardware


A computer will accept editorial copy only in machine- 
readable form. This can be punched cards, punched tape, 
magnetic tape, or scanner-readable typed characters. Our 
manuscript must therefore be retyped, or rather key- 
boarded, first. Here we have three basic choices: 


TAPE PERFORATORS


Most computer composition systems in operation today
use tape-punching typewriters as input hardware. Flexowriters
and Dura machines built around standard IBM
electric typewriters are the most popular. Prices range
from $1,500 to $2,500 each. There are also tape perforators
without printing facilities, such as the Fairchild
TTS perforators used in many printing plants. At
least one of the machines should also have a tape reader
to permit checking the performance of the machine by
running punched tape through the reader and checking
the hard copy produced from the tape.

The choice of tape-punching typewriters is difficult. On
the Dura Mach 10 tape-punching typewriter, high typing
speeds are possible because shift and unshift codes are
punched automatically when the typist hits the conventional
shift keys. On Friden Flexowriters each shift code
must be punched separately by the operator. Unfortunately,
some Dura machines get out of adjustment and
occasionally lose a shift code when typists work at their
maximum speed. Slowing down the typists eliminates this
malfunction, but output is then down to that of Flexowriters.
This leads to the conclusion that there is as yet
no ideal tape-punching input machine for computer
composition.

Punched paper tape for computers is often called idiot
tape by printers, because it can be produced by ordinary
typists without years of special training. This input tape
contains nothing more than code equivalents of typed
characters, without hyphenating or end-of-line indications,
and with a minimum of special control characters
for designating type font changes.
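
The shift-code behavior and the content of such an input tape can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the code names SHIFT and UNSHIFT, the function name punch_tape, and the use of a lowercase letter to stand for each key code are hypothetical, and are not the actual Flexowriter or Dura code set.

```python
# Hypothetical stand-ins for the shift and unshift codes on the tape.
SHIFT, UNSHIFT = "<SHIFT>", "<UNSHIFT>"


def punch_tape(text):
    """Turn typed text into a list of tape codes.

    A shift or unshift code is emitted automatically whenever the case of
    the letters changes. Carriage returns are recorded as plain spaces,
    because the tape carries no end-of-line indications.
    """
    codes = []
    shifted = False
    for ch in text:
        if ch in "\r\n":
            ch = " "
        # Only letters depend on the shift state; other keys leave it alone.
        needs_shift = ch.isupper() if ch.isalpha() else shifted
        if needs_shift != shifted:
            codes.append(SHIFT if needs_shift else UNSHIFT)
            shifted = needs_shift
        codes.append(ch.lower())
    return codes


print(punch_tape("Hi\nyou"))
```

A machine that loses a shift code, as described above for typists working at maximum speed, would leave one of these markers out of the list, and every following letter would come out in the wrong case until the next shift change.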

A typewriter keyboard has many advantages over the
Linotype or Monotype keyboards currently used in printing
plants. First of all, ordinary typists can be used for
keyboarding of composition, after no more than a few
hours of additional training. These typists can generally
turn out at least 20% more work than is ordinarily
obtained on keyboards of hot-metal casting machines.
One reason is that a typist has only 44 keys to hit, while
a Linotype operator has more than twice as many. Another
reason is that typists do not have to slow up for
the end-of-line justification or hyphenation decisions. The
typewriter also gives hard copy immediately for proofreading.


KEYPUNCHES 


Our second input possibility is a standard keypunch, 
driven by an essentially standard typewriter keyboard. 
The punched cards produced here can be imprinted either 
simultaneously with punching or in a separate machine 
to provide a line of printing across the top of the card 
for proofreading. Keypunching is viewed as an interim
input measure for use with computers that have only
punched-card input equipment, however, because it is
slower and more expensive than punching paper tape.


SCANNERS 


Controlled-font typewriters can be teamed up with a 
character-reading scanner for use as input hardware. 
Here editorial copy is retyped on an electric typewriter
having a special font of all-caps characters that can be
read accurately by a photoelectric scanning machine, such
as the Farrington and Control Data Corp. scanners.
Special controlled-font balls are now available for the
IBM Selectric bouncing-ball typewriter.

Scanners generally deliver magnetic tape, ready for use 
as computer input. The scanner approach is attractive, 
but the all-caps limitation of the lower priced scanners 
is a drawback for input typing and for proofreading the 
hard copy. 

An example of book composition as it might be typed 
for an all-caps scanner, in Fig. 1, shows how function 
codes might be used to obtain the desired typography. 
Note that the all-caps input lines do not end on the same 
words as the final typeset copy, because the computer ig- 
nores typewriter carriage returns. Note also that the com- 
puter has adjusted the spaces between words to achieve 
justification (lineup at the right). It could also hyphenate 
words whenever necessary to avoid excessive space be- 
tween words. 

It is possible today to purchase a scanner that will read 
both caps and lower case typing, in regular or controlled 
fonts. Cost is much more than for an all-caps scanner, 
however, and accuracy is still open to question. One of 
these, the Retina machine made by Recognition Equip- 
ment Corp., is being tested by Perry Publications in 
Florida for setting newspaper classified ads. Philco scanners
have multi-font as well as lower case capability at
correspondingly higher prices.

When a scanner for ordinary typing becomes available
at a reasonable price, the pages of an author's manuscript
could be fed directly into the scanner without retyping
for conversion to magnetic tape. Pencil marks
could be made at points where editing is desired. These
would make the scanner read editing changes and corrections
that have been typed between the lines or generate
a code that tells the computer to take the desired
change from another source.


COMPUTERS AND PEOPLE BY POSTLEY MCGRAW-HILL
FC AB DECISION MAKING AR IS AN ACTIVITY WHICH HAS HISTORICALLY BEEN
PERFORMED BY PEOPLE. THESE PEOPLE, INCLUDING YOU AND ME, SEEM RATHER
DEFENSIVE ABOUT THEIR EXCLUSIVE PREROGATIVES TO PERFORM THIS ACTIVITY, THAT IS,
THEIR PREROGATIVES TO "MAKE DECISIONS."
F2 AT DEFINITION OF DECISION MAKING F1 AR FP THE PROCESS BY WHICH ONE ARRIVES
AT CONCLUSIONS—MAKES DECISIONS—IS INTERESTING TO EXAMINE. IN THE FIRST PLACE,
WHILE THESE CONCLUSIONS ARE USUALLY SAID TO DERIVE DIRECTLY FROM
A SET OF "FACTS," IT ALMOST ALWAYS DEVELOPS THAT THESE SO-CALLED "FACTS"
ARE IN REALITY AI AN ESTIMATE AR OF THE TRUE FACTS.


DECISION MAKING is an activity which has historically been
performed by people. These people, including you and me,
seem rather defensive about their exclusive prerogatives to
perform this activity, that is, their prerogative to “make
decisions.”

A definition of decision making

The process by which one arrives at conclusions—makes
decisions—is interesting to examine. In the first place,
while these conclusions are usually said to derive directly
from a set of “facts,” it almost always develops that these
so-called “facts” are in reality an estimate of the true facts


FUNCTION CODE   COMMAND
FC              START NEW CHAPTER
FP              START NEW PARAGRAPH
F1              1 LINE SPACE
F2              2 LINES SPACE
A               FONT CHANGE COMMAND
AB              CAP AND SMALL CAP
AT              BELL GOTHIC BOLD SUBHEAD
AR              NEWS GOTHIC BODY TEXT
AI              ITALICS OF BODY TEXT
?               CAPITALIZE NEXT LETTER
                PERIOD FOLLOWED BY SPACE
                MAKES NEXT LETTER CAP


Fig. 1. Example of book composition as typed for Farrington scanner (top), final output copy (lower left), and meanings of
control code characters used by typist.


● Computer Hardware



Many different general-purpose computers are suitable
for electronic processing of manuscripts. The IBM 1620
general-purpose computer leads the picture here, with
several dozen being used in printing plants as well as in
newspaper applications. Software for newspaper composition
on this computer is available from IBM. Next
in popularity is the RCA 301, for which computer typesetting
software has also been made available by the
manufacturer. Other general-purpose computers used for
composition include the Honeywell 200, Digital Equipment
PDP-8, several Control Data models, the IBM 1400
series, and NCR 315. A few of these are used exclusively
for composition, but in most installations the primary
application is accounting. Composition work, for books
in particular, is generally run during idle time on second
or third shifts.

Choice of a particular general-purpose computer will
generally be based on such factors as availability of the
necessary operating time, availability of some or all the
necessary software from the manufacturer, and availability
of compatible input and output composition hardware
for the computer. Size of the internal storage and
speed of the input and output hardware are other factors
to be considered. The output paper-tape punch speed is
particularly important because it is generally operated
on-line. A fast line printer is another asset; even better
is one having a cap and lower case chain.

In general, the sum of the times required for the input
reader, line printer, and output punch to handle their
respective total characters will closely approach the total
computer operating time for a particular composition
job. The actual processing runs involving only magnetic
tape handlers take a small percentage of the operating
time on modern high-speed, general-purpose computers.
This fact makes it possible to estimate composition run
times with a high degree of accuracy, assuming availability
of debugged programs, a good computer room
operating staff, and an input that contains the correct
control codes needed for proper processing by the
computer.
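
The run-time rule stated above (operating time is roughly the sum of the input reader, line printer, and output punch times) can be written as a short Python sketch. The function name estimate_run_seconds and all figures in the example are assumptions chosen for illustration. Only the 500 characters-per-second reader rate echoes a figure mentioned earlier in the paper; the printer and punch rates are hypothetical.

```python
def estimate_run_seconds(input_chars, printed_chars, punched_chars,
                         reader_cps, printer_cps, punch_cps):
    """Estimate computer operating time for one composition job.

    Each device's time is its character count divided by its speed in
    characters per second. Magnetic-tape processing time is ignored,
    since it is described as a small percentage of the total.
    """
    return (input_chars / reader_cps
            + printed_chars / printer_cps
            + punched_chars / punch_cps)


# Hypothetical job: 600,000 characters read at 500 cps, the same text
# proofed on a line printer assumed to run at 1,000 cps, and 660,000
# characters punched on an output punch assumed to run at 110 cps.
seconds = estimate_run_seconds(600_000, 600_000, 660_000, 500, 1_000, 110)
print(f"{seconds / 3600:.2f} hours")
```

With these assumed rates the output punch accounts for most of the total, which is consistent with the remark that punch speed is particularly important.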


Service bureaus offering computer composition include
National Computer Analysts, Rocappi, and Documentation,
Inc. These firms operate in competition with commercial
printers because up to now only a few printers
are using their computers for composition.

Special-purpose computers for composition include the
Mergenthaler Linasec, Compugraphic DTP, three Harris-Intertype
computers, and the Fairchild Comp/Set. These
are intended chiefly for newspaper and printing plants,
for use with tape-controlled hot-metal or photocomposition
machines.

The most expensive of the three Harris-Intertype
models provides a combination of logic and magnetic-drum
dictionary lookup for automatic hyphenation. A
simpler model uses only logic for hyphenation, so it
occasionally inserts hyphens by guessing, when logic fails.
The simplest and cheapest model stops automatically
whenever it reaches a point where a word must be
hyphenated to allow a human operator to make a decision
on where the hyphen should go.
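
The three approaches to hyphenation just described can be contrasted in a small Python sketch. Everything here is hypothetical: the single rule in logic_breaks (break between two consonants that sit between vowels) is a toy stand-in for whatever logic the actual machines use, the model names are invented, and the fallback of guessing the middle of the word is an assumption.

```python
VOWELS = set("aeiouy")


def logic_breaks(word):
    """Toy logic rule: allow a break between two consonants flanked by vowels."""
    w = word.lower()
    return [i + 1 for i in range(1, len(w) - 2)
            if w[i - 1] in VOWELS and w[i] not in VOWELS
            and w[i + 1] not in VOWELS and w[i + 2] in VOWELS]


def break_points(word, model, exceptions=None, ask_operator=None):
    """Return break positions for a word under one of three models.

    "operator":   stop and let a human decide (ask_operator callback).
    "dictionary": look the word up in an exception list, else use logic.
    "logic":      use logic only, guessing mid-word when no rule applies.
    """
    if model == "operator":
        return ask_operator(word)
    if model == "dictionary" and exceptions and word.lower() in exceptions:
        return exceptions[word.lower()]
    points = logic_breaks(word)
    return points if points else [len(word) // 2]


def show(word, points):
    """Insert hyphens at the given positions for display."""
    pieces, last = [], 0
    for p in points:
        pieces.append(word[last:p])
        last = p
    pieces.append(word[last:])
    return "-".join(pieces)


exceptions = {"wattage": [4]}  # hypothetical exception-dictionary entry
print(show("wattage", break_points("wattage", "logic")))
print(show("wattage", break_points("wattage", "dictionary", exceptions)))
```

The logic-only call breaks the word after the first t, while the exception entry keeps the proper name intact, which is the kind of case a dictionary lookup exists to catch.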


● Software Problems


A computer can only follow the instructions that are
stored in its electronic memory. These instructions are
known as software. Each program of instructions must
allow for all possible combinations of input problems,
yet must fit into the available core storage.

Computer software and computer hardware together
serve to convert low-cost keyboarding by a typist to
whatever is needed for driving the output composition
machines. To do this, the software must include one or
more of the following composition functions: (a) format
and typography control; (b) justification; (c) hyphenation;
(d) page makeup; (e) code conversion for the
output hardware.

The control codes that specify line widths, indentations,
type fonts, italicizing, bold-facing, subscripts, superscripts,
leading (vertical spacing between lines), tabulating,
and other parameters of composition must be
converted by computer software into the units of length
on which the computer bases its running width total of
the characters already set in each line. To do this, the
computer must be given beforehand the exact width of
each character, in each of the fonts being used in the
job being run. For justification, the computer must also
be given the range of word spacing that will be permitted
when justifying lines for a particular composition job.
The greater the range, the less will be the need for
hyphenating words. When the computer comes to a
word that won't fit in the specified line width, it must
reject that word and increase the spaces between words
to fill the line.
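
The justification step described above can be sketched in Python as follows. This is a minimal illustration, not any vendor's program: the function names, the uniform width table, and the choice to return None when the required word space exceeds the permitted maximum (the point at which hyphenation would be needed) are all assumptions.

```python
def word_width(word, widths):
    """Total set width of a word, from a per-character width table."""
    return sum(widths[ch] for ch in word)


def justify_line(words, widths, line_width, min_space, max_space):
    """Fill one line and compute the word space that justifies it.

    Returns (line_words, space, remaining_words). The space is None when
    the line cannot be justified within the permitted word-space range.
    """
    line = []
    chars = 0  # running width total of the characters already set
    for word in words:
        w = word_width(word, widths)
        gaps = len(line)  # number of word spaces if this word is added
        if line and chars + w + gaps * min_space > line_width:
            break  # reject the word that will not fit
        line.append(word)
        chars += w
    rest = words[len(line):]
    gaps = len(line) - 1
    if not rest:
        return line, min_space, rest  # last line is not stretched
    if gaps == 0:
        return line, None, rest  # a single word cannot be spaced out
    space = (line_width - chars) / gaps
    return line, (space if space <= max_space else None), rest


# Hypothetical font in which every lowercase letter is 10 units wide.
widths = {ch: 10 for ch in "abcdefghijklmnopqrstuvwxyz"}
print(justify_line(["the", "quick", "brown", "fox"], widths, 150, 5, 20))
```

Narrowing the permitted range (for example a maximum space of 8 units in the same call) makes the function return None for the space, which illustrates why a greater range reduces the need for hyphenation.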

A refinement of the justification program involves re- 
arranging words in previous lines in a paragraph when- 
ever a line cannot be justified within specified limits with- 
out hyphenating. The computer tries to transfer words 
from the end of one line to the start of the next, working 
back to the beginning of a paragraph, and staying within
the word spacing limits, to see if a different arrangement
of words can be achieved that will eliminate the need for
hyphenating. This technique naturally works best for 
wide columns such as are used in books. 

So far, software programs for computer composition 
have been written for specific jobs. This means that they 
must generally be revised rather extensively when changes 
in format, typography, or content are desired. At the 
present time, however, both RCA’s Graphic Services 
Division and National Computer Analysts are working 
on programs that will be more or less universal, so as 
to permit reasonable changes in format. These programs 
are much more costly to produce, but will be cheaper in 
the long run because their costs can be spread out over 
a number of users of computer composition. 


HYPHENATION 


The software goal envisioned by many in computer- 
controlled typesetting is the introduction of hyphens cor- 
rectly at ends of lines by means of a program that in- 
volves only logic, with no dictionary lookup of exception 
words. Perfection will never be achieved here as long as 
dictionaries are used as hyphenating guides, because
proper names and many ordinary words in dictionaries 
do not follow logical rules for hyphenation. Scientific 
words derived from proper names (such as wattage, 
hyphenated watt-age because named after James Watt) 
are another headache, as also are words that are spelled 
the same but hyphenated differently depending on
pronunciation. Worse yet, dictionaries do not agree on
hyphenation. Even the Second and Third Editions of 
Merriam-Webster's Unabridged Dictionary differ from 
each other. 

One program is available today that can equal or better 
the accuracy of human hyphenation while relying
entirely on logic. This was written at National Computer
Analysts in Princeton for an RCA 301 computer, and 
has been rewritten for Control Data and Univac com- 
puters. In developing this software, the computer was 
programmed to hyphenate test words in every possible 
position, compare its work with hyphenation as given in 
the Third Edition of Merriam-Webster Unabridged, and 
place asterisks ahead of each word having a hyphen in a 
wrong position. The resulting printout, part of which is 
shown in Fig. 2, was then used as a guide for further 
refining the hyphenating logic.
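
The comparison run described above can be sketched as follows. The function names and the printout layout are hypothetical, and the word pairs in the usage note are illustrative only; the sketch simply flags a word when the computer's version has a hyphen where the reference has none, with an option to count omitted hyphens as well.

```python
def hyphen_positions(hyphenated):
    """Return the letter offsets at which hyphens occur, e.g. 'BI-SON' -> {2}."""
    positions, letters = set(), 0
    for ch in hyphenated:
        if ch == "-":
            positions.add(letters)
        else:
            letters += 1
    return positions


def is_error(reference, candidate, count_omitted=False):
    """True when the candidate has a misplaced (or, optionally, omitted) hyphen."""
    ref = hyphen_positions(reference)
    got = hyphen_positions(candidate)
    misplaced = got - ref
    omitted = ref - got
    return bool(misplaced) or (count_omitted and bool(omitted))


def report(pairs, count_omitted=False):
    """Build printout lines, with asterisks ahead of each wrongly hyphenated word."""
    lines = []
    for reference, candidate in pairs:
        flag = "**" if is_error(reference, candidate, count_omitted) else "  "
        lines.append(f"{reference:<16}{flag} {candidate}")
    return lines
```

For example, `report([("BISH-OP", "BI-SHOP"), ("BI-SON", "BISON")])` flags only the first pair; passing `count_omitted=True` flags the second as well.
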

Here is a suggestion. Why not line up a stable of ex- 
perts in linguistics, logic, typesetting, and proofreading to 
establish a consistent set of rules for hyphenation that are 
compatible with computer processing? A computer pro- 
gram could then be written from these rules to produce 
a word list showing the new and logical hyphenation for 
every word in the English language for publishing as the 
new standard hyphenating guide. Could we agree on a 
basic standard for hyphenation, based only on logic?

Fig. 2. Early test of automatic hyphenation by logic
alone. Left column shows hyphenation of Merriam-Webster
Unabridged Third Edition and right column shows hyphenation
by National Computer Analysts logic routines. Asterisks
indicate misplaced hyphens. Hyphens omitted by computer
were not counted as errors here, but newer test program
counts both omitted and erroneous hyphens.

The majority of computer-controlled typesetting operations
today are using a combination of logic and dictionary
lookup for hyphenation. Here, when the computer reaches a
word that must be hyphenated, it first looks in its
internal dictionary of problem words that have been stored
with hyphens in all possible positions. If the word is
found there, the computer chooses a hyphenating position
that places the line within the specified justifying
limits. If the word is not in the internal dictionary, the
computer then applies its stored rules of logic, based on
common word endings and common combinations of letters,
for placing the hyphen. If the word is one of the
exceptions for which logic rules do not apply or have not
yet been written into the program, the computer can
arbitrarily place the hyphen after an odd-numbered letter,
because statistics show that most hyphens fall after the
3rd, 5th, 7th, etc., letter in a word. In one program,
when none of the rules applies, the computer in effect
flips a coin and places the hyphen arbitrarily before or
after a convenient vowel.
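
The three-stage lookup just described (exception dictionary, then logic rules, then an odd-letter guess) might be sketched as below. Everything here is a simplified assumption: the exception list holds only two sample words, the suffix list stands in for a much larger rule set, and the function names are invented for illustration.

```python
# Hypothetical exception dictionary: problem words stored with their hyphens.
EXCEPTIONS = {"wattage": "watt-age", "bishop": "bish-op"}
# Hypothetical logic rules: break before a few common word endings.
SUFFIXES = ("ing", "er", "tion")


def stored_positions(hyphenated):
    """Letter offsets of the hyphens in a stored dictionary entry."""
    positions, letters = [], 0
    for ch in hyphenated:
        if ch == "-":
            positions.append(letters)
        else:
            letters += 1
    return positions


def break_points(word):
    """Return (candidate break offsets, name of the stage that produced them)."""
    key = word.lower()
    if key in EXCEPTIONS:
        return stored_positions(EXCEPTIONS[key]), "dictionary"
    for suffix in SUFFIXES:
        if key.endswith(suffix) and len(key) - len(suffix) >= 2:
            return [len(key) - len(suffix)], "logic"
    # Last resort: guess after the 3rd, 5th, 7th ... letter.
    return list(range(3, len(key) - 1, 2)), "odd-letter guess"


def choose_break(word, room):
    """Pick the latest break whose first part plus hyphen fits in `room` characters."""
    positions, source = break_points(word)
    fitting = [p for p in positions if p + 1 <= room]
    if not fitting:
        return None  # the whole word must go to the next line
    p = max(fitting)
    return word[:p] + "-", word[p:], source
```

Here `room` is counted in characters for simplicity; an actual composition program would measure the remaining space in width units and check it against the justifying limits.
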
A third hyphenating alternative, which no one has tried
as yet because it calls for more internal memory capacity
than is available in any computer, would involve storing
all of the words in the Merriam-Webster Unabridged
Dictionary with all hyphen positions, for complete
dictionary lookup. With a sufficiently large capacity in a
random-access disk file or other large memory, it is not
inconceivable that this approach may be tried in the
future.



Finally, at the other extreme is manual hyphenation,
wherein we have computer software that stops everything
and sounds an alarm when a word is reached that requires
hyphenation. An operator must then look at the word as
shown on a computer display or as typed out by the
computer and indicate where the hyphen should go. This
manual operation is at present used only with small
special-purpose computers, because it fails to utilize the
hyphenating capabilities of a computer. On the other
hand, with wide-column book composition an appropriate
justifying program could reduce the need for hyphens
to less than one line in 100. Here it may be more
economical to use this combination of automatic and human
procedures.

For page makeup of books, the available newspaper
composition programs need to be expanded to cover
handling of illustrations, computation of book length, and
elimination of bad breaks such as widows, rivers, and
heads at the bottom of a page or column. The preparation
and testing of suitable programs for publishing
requirements could be costly and time consuming for an
individual publisher. Standard composition programs are
now being prepared by some service bureaus for the
benefit of all of their publisher clients.


ERROR DETECTION 


IBM is already carrying out research on the use of a
computer to detect errors in spelling and typing of
editorial copy. One obstacle here is the size of the computer
internal storage required for holding all of the words in
the English language, with high-speed access to each word
so the process does not appreciably slow up computer
processing. Another drawback is that some typing errors
create acceptable different words.


PHOTOCOMPOSITION CONTROL CODES 


The last part of the software for computer-controlled
typesetting is a program for converting the code format
of the computer to that required by the composing
machine and adding the necessary machine control codes.
Computer manufacturers have developed this conversion
software for the Teletypesetter tape required for
hot-metal newspaper composition, and will presumably have
conversion programs for photocomposition machines

Fig. 3. Use of lesser-than sign to indicate shift to caps
and greater-than sign to indicate shift down to lower case
as required for proofreading accurately from line printer
proofs.

Normally, typeset material is proofread twice, on galley
proofs and on page proofs. Computer composition will
undoubtedly change this. Here are some of the possibilities
that we must explore to determine the optimum proofreading
procedures for each system from the standpoint of
economics, accuracy, and acceptance by editors and authors.

Proofreading of the hard copy produced on input
tape-punching typewriters could be the only proofreading
needed in an error-free computer system, if we could count
on the typewriter and its punches to work perfectly. We
can't yet. Furthermore, the hard copy at the input is not
typed to final column width, hence there can be no checking
of hyphenation, widows, and the other factors that
determine high-quality composition. The hard input copy
will be cap and lower case, however, and can have code
characters to indicate italics, bold face, special
characters, and other desired typographic changes.

The hard copy that is produced as a by-product of the
tape-punching is thus quite useful for detecting human
errors in keyboarding. It will not show up machine errors
made between the fingertips of the typist and the magnetic
tape of the computer, however, so we need at least one
more proofreading. How do we get it?

Use of a computer's own line printer to produce a display
for proofreading is one logical answer. A major drawback,
however, is that most line printers have only capital
letters. Special programming is then needed to identify
the letters that must be capitalized in the final output.

One method of indicating capitalization is based on the
fact that three characters must be punched on paper tape
each time a typist hits a capital letter. First comes the
upper-shift character, produced when she touches the shift
key. Next is the actual character, followed by the
down-shift character that is punched when she releases the
shift key. These shift characters can be converted to
lesser-than and greater-than signs surrounding the letter
or letters that are true caps on line printer proofs, as in
Fig. 3. (Yes, we do get complaints from proofreaders with
this approach.)

Another method involves more programming to remove the
two shift characters from the printout and, instead, place
an open lozenge or other special character on the line
below, directly under each cap character, as in Fig. 4.
This makes proofreading much easier but doubles
line-printer running time. This method is better from the
standpoints of proofreading accuracy, proofreader morale,
and labor vs. machine-time costs.
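
Both ways of marking true capitals on an all-caps line printer can be sketched briefly. The two shift codes below are hypothetical stand-ins for the upper-shift and down-shift characters on the tape, and the function names are invented for illustration.

```python
# Hypothetical stand-ins for the tape's upper-shift and down-shift codes.
UP, DOWN = "\x0e", "\x0f"


def angle_proof(tape):
    """First method: print all caps, with < and > around true capitals."""
    return tape.replace(UP, "<").replace(DOWN, ">").upper()


def lozenge_proof(tape, mark="\u25ca"):
    """Second method: drop the shift codes and return (text line, mark line).

    The mark line carries an open lozenge under each true capital, so the
    printer must print two lines for every line of text.
    """
    text, marks, shifted = [], [], False
    for ch in tape:
        if ch == UP:
            shifted = True
        elif ch == DOWN:
            shifted = False
        else:
            text.append(ch.upper())
            marks.append(mark if shifted and ch.isalpha() else " ")
    return "".join(text), "".join(marks).rstrip()
```

With `tape = UP + "i" + DOWN + "ndex " + UP + "m" + DOWN + "edicus"`, the first function gives `<I>NDEX <M>EDICUS`, while the second gives `INDEX MEDICUS` plus a second line with a lozenge under the I and the M. The extra line per text line is where the doubled printer running time comes from.
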

A cap-and-lower-case printer would give a more readable
proof, but special characters would still be needed for
italicizing, bold-facing, and characters not on the chain.
The cost of this chain and the additional computer
circuitry required must be weighed against the value of
the lower-case printout for proofreading and its value in
other applications. As one example, this cap-and-lower-case
chain might be used to produce more readable repro copy for
indexes at low cost compared to photocomposition or
hot-metal typesetting, though at a serious penalty in
characters per inch since the letters are uniformly spaced.
Also, for a given level of readability, the
cap-and-lower-case printout cannot be reduced in size as
much as an all-caps printout and will therefore require
more printed pages per job.


Proofreading of output from Xerox copies of paper repro
proofs or from blueprints of negatives is the ideal
solution, though probably the most expensive. Here we see
exact equivalents of conventional galley and page proofs
showing all formats and all fonts of type called for.


CORRECTIONS


After errors have been caught by proofreading, paper
tapes for the corrections must be punched with appropriate
record, field, line, or other identifying codes. These
correction tapes are fed into the computer for updating
the master tape file, since this will be used for
subsequent editions. Five procedures are now available,
depending on the nature and amount of corrections:

1. Use the computer to punch output tapes only for
lines requiring change, run these through the
photocomposition machine, and get corrected lines for
stripping into the negatives or pasting on the repro
positives. This is generally the preferred procedure.

2. Use the computer to punch the entire output tapes
over again for a second complete run through the
photocomposition machines. This costly procedure
will probably be preferable when there are drastic
changes, such as might occur when a publisher
allows an author to rewrite his manuscript on
galley or page proofs.

3. Reperforating the punched paper input tape, with
manual typing only when an erroneous word is
reached, to obtain a perfect new tape for use as
computer input. Chief drawback here is almost a
doubling of input keyboarding time with no
assurance that new errors won't be made.

4. Splicing of corrections into the original tape. This
requires a person who can read punched characters,
hence is rarely done.

5. Merging of correction tapes with the original tape
by operator switching of two readers. This takes
even more input operator time than the third
procedure and can give new errors if readers are not
switched back and forth at the correct instants.

American Documentation, April 1966
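
The updating of a master file from coded correction tapes, and the selection of only the changed lines for repunching as in the first procedure, can be sketched as follows. The record layout (a record code, a field number, and the new wording) and the function names are assumptions made for illustration.

```python
def apply_corrections(master, corrections):
    """Merge corrections into a copy of the master file.

    master      -- dict mapping (record_code, field_number) to field text
    corrections -- iterable of (record_code, field_number, new_text)
    Returns the updated file and any corrections whose codes match nothing.
    """
    updated = dict(master)
    rejected = []
    for record, field, text in corrections:
        if (record, field) in updated:
            updated[(record, field)] = text
        else:
            rejected.append((record, field, text))  # bad identifying code
    return updated, rejected


def changed_keys(master, updated):
    """List the fields that differ, i.e. the only lines needing new output tape."""
    return sorted(key for key in updated if master.get(key) != updated[key])
```

Only the fields returned by `changed_keys` would need to be punched again and run through the photocomposition machine; the rejected list gives the proofreader something to check rather than silently dropping a mistyped code.
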


• Output Hardware


Output hardware includes everything needed to convert
output magnetic tape of the computer to hot metal
or to photographic negatives for plate-making. When the
output machine requires punched paper tape, the computer
itself or an off-line converter must drive a high-speed
tape punch. With output machines like Photon
ZIP and the forthcoming CBS-Mergenthaler Linotron,
which will take magnetic tape directly, mag-to-paper
conversion will not be needed. In commercial publishing,
the punching of output paper tapes is generally done
on-line, even though this means many more hours of
computer time.


HOT-METAL OUTPUT 


Many Linotype and Intertype casting machines will
operate from punched paper tapes. The chief drawback
to a hot-metal output machine, however, is the inherent
mechanical error rate of the machine. In general, a
machine malfunction will cause an error in about one out of
every 50 cast slugs even with the best possible machine
operation and maintenance. Until this error rate is
eliminated, the trend in computer-controlled typesetting
will logically be toward photocomposition machines,
because these can theoretically operate without error.


PHOTOCOMPOSITION OUTPUT


Machines currently available for photocomposition
from punched paper tape include those made by Photon,
Harris-Intertype, ATF, Mergenthaler, and Alphatype.
More are undoubtedly under development. These use a
punched paper input tape to control the exposure of a
film negative, character by character, with precise
positioning of characters and precise control of spacing
between lines, as determined by the codes produced during
computer processing. The choice of a machine must be
based on cost, speed, reliability, quality, the number of
fonts of type required, the convenience of obtaining
special characters, and the availability of backup machines
if breakdowns occur.


ZIP 


The high-speed Photon ZIP machine installed at the
National Library of Medicine is the only machine that
today takes computer magnetic tape directly. Its successful
performance in producing Index Medicus is being closely
studied by publishers and printers. The price tag today is
$200,000, with the user furnishing his own computer. This
first ZIP is setting about 300 characters per second, which
means that it can expose all the negatives for a
100,000-word book in less than an hour. This is equal to
about 30 Linotypes. ZIP holds 264 characters at a time, and
these can be changed by sliding in a new set of glass
matrices.
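
The speed figures above can be checked with a little arithmetic. The sketch assumes an average of six characters per word (five letters plus a space), which is an assumption and not a figure from this article; the function name is likewise invented.

```python
CHARS_PER_WORD = 6  # assumption: five letters plus one space per word


def setting_time_minutes(words, chars_per_second, chars_per_word=CHARS_PER_WORD):
    """Minutes needed to set a job of the given length at a steady speed."""
    return words * chars_per_word / chars_per_second / 60


# A 100,000-word book at the ZIP's quoted 300 characters per second.
zip_minutes = setting_time_minutes(100_000, 300)

# Speed per Linotype implied by "equal to about 30 Linotypes".
implied_linotype_cps = 300 / 30
```

Under that assumption the book takes a little over half an hour, consistent with "less than an hour", and the comparison implies roughly 10 characters per second for one tape-driven Linotype.
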

The quality of the output of ZIP is satisfactory for
indexes and directories, but most book and magazine
publishers require better work. Chief defects are erratic
vertical positioning of letters and variations in letter
density.

Each line on the negative for Index Medicus is exposed
across all three columns, because their computer also does
page makeup. The internal memory or external ... must
therefore have enough capacity to hold the ... of an entire
page, in addition to the required ...

... some New York City telephone directories, for use by
operators. The third is scheduled to go to England. There
is real hope that an improved ZIP will take the magnetic
output tapes of the computer directly and give graphic arts
quality in a variety of type fonts.

With a high-speed photocomposition machine that 
accepts magnetic tape directly, there may be less of a 
cost penalty for using the machine to produce proofs. The 
machine is so fast that it would likely be idle part of the
time on second or third shifts, if not on the first shift. It 
is then logical to use the machine for producing either 
paper prints or film negatives for proofreading, since the 
out-of-pocket extra cost is only for the photographic proof 
paper. 


PHOTON 713


A few of the new Photon 713 photocomposition machines
are now in the field. Early reports indicate that this
machine has real promise for producing composition of
graphic arts quality from punched paper tapes of computers.
The specifications definitely offer advantages over other
photocomposition machines available today. Speed is about
20 characters per second, with a choice of 720 characters
or eight full fonts in eight type sizes, in ... either
negatives or repro positives from ... tape. Price is about
$50,000. Photon is ... for $15,000 additional, the same
capability ... version, if it performs reliably, will be of
high interest to printers and service bureaus because it
eliminates the slow and costly extra step of using a
computer to convert magnetic tape to punched paper output
tape.


FOTOTRONIC


Another promising photocomposition machine for punched
paper output tapes of computers is the Harris-Intertype
Fototronic, now selling for around $55,000. This has the
same rated speed as the Photon 713, but actual throughput
varies with the number of type size and font changes
required on a given job.


CATHODE-RAY PHOTOCOMPOSITION


Many cathode-ray character-generating systems have
been proposed for direct operation from magnetic tape,
but as yet none is on the market in commercial form.
The Government Printing Office awarded jointly to
Mergenthaler Linotype Company and CBS Laboratories a
$2,185,000 contract to produce two such machines, called
Linotrons, for 1966 delivery. Speed varies with type size,
because it takes longer to make an electron beam create
a larger character from a pattern of fine lines, but is
expected to be much faster than ZIP. For making paper
proofs, which can be lower in quality as long as they are
readable, the Linotron can run at up to 5,000 characters
per second. Linotron will give a choice of 256 characters
in one basic font that can be changed electronically to any
of eight different sizes ranging from 5 to 18 points.
Resolution is claimed to be better than Linotype hot-metal
work but not quite up to Linofilm photocomposition
quality. A price of $500,000 is being quoted for a
commercial model.

Other firms working on the cathode-ray approach include
the RCA Graphic Services Division, Alphanumerics,
K. S. Paul & Associates in England, and Rudolf Hell in
Germany.

The role of costly, high-speed cathode-ray machines in 
commercial publishing is as yet unknown. Smaller and 
slower machines can share the work and back up each 
other. A fast machine is economically unsound unless it
can be kept busy or can be justified on the basis of its
high speed for a few specific jobs.


LINE PRINTER OUTPUT 


With proper adjustment and operation of a high-speed
line printer, using a high-quality new nylon ribbon and
a high grade of paper, repro copy can be obtained directly
from a computer. This can then be reduced in size
photographically to produce plates for offset printing of
indexes, directories, and other types of reference books.
Alternatively, paper masters can be produced directly on
the line printer if the original large type size is acceptable.

The two chief drawbacks of line printers are the waste
of space inherent in uniform spacing of characters on a
line printer and the objections of users to the all-caps
print-out of most line printers. The cap-and-lower-case
chain that is available for the IBM 1403 line printer gives
improved readability in the new typewriter-like font, but
the spacing is still the same. In one case, for Index
Medicus, this cap-and-lower-case printer is backup for the
Photon ZIP, but an emergency issue produced on the
line printer will take twice as many pages.


The line printer is envisioned only for jobs where less
than graphic arts quality can be tolerated and space
requirements are not critical. Indexes, book-form library
catalogs, parts lists, and some directories are examples of
jobs for which a line printer should be considered.


• Economical Applications


Although rapid progress has been made in computer
composition, there are today very few economical
commercial applications that provide graphic arts quality.
One reason for this has been the lack of photocomposition
machines suitable for computer composition. Another is
the lack of completely versatile software.

Production costs drop most spectacularly when printing
jobs make maximum use of the data processing capabilities
of computers, as with indexes and directories. Here
are some job characteristics that favor use of computers:

1. A need to explode information so a given item of
input is duplicated many times by the computer.

2. A need to sort items of information, so they appear
in one or more desired sequences regardless of the
order in which the items enter the computer.

3. A need for updating the information, by cumulating
corrections and new material with old material, so
the computer can be used to eliminate rekeyboarding
of the unchanged older material.

4. A need for speed that overrides cost considerations.

One publication that met these characteristics was the
annual Electronics Buyers' Guide. This McGraw-Hill
publication tells who makes the components and equipment
that constitute the electronic industry. The input
data for this directory is the equivalent of 300 pages
in print, while the output or final directory is almost
800 pages in print, a tremendous explosion of information.
There is a real requirement for data processing here
also; the input from questionnaires requires three alpha
sorts. Just before the cutoff deadline for input, the entries
of advertisers must be changed to bold face type and
sequenced separately. And finally, there is the requirement
for updating once a year by incorporating any
changes in the manufacturer's name and address, phone
number, corporate data, addresses of representatives, or
products made. The potential for saving by computer was
so attractive that a decision was made to produce it this
way for the first time in 1965.

Since there was no precedent for producing a directory
of this size by computer, many decisions had to be made
in conjunction with an exhaustive systems study. While
describing the procedures finally adopted, the problems
and options will be covered for guidance in
similar jobs. Here are the main steps:


1. Assign Code Numbers. First of all, the 7,000
manufacturer names were arranged manually in the desired
alphabetic sequence and fed into a computer for automatic
assignment of 6-digit manufacturer code numbers,
with a spacing of 125 between the numbers. This permits
insertion of new names in correct alphabetic sequence in
future years. Each new name is given a number halfway
between the numbers of adjacent names to leave room
for inserting further names. Product headings were
similarly tape-punched and given 5-digit codes.

When assigning codes, the computer also adds a check
digit at the end of each number to permit automatic
computer detection of errors in typing of code numbers.
If an error is made in a code number, the computer will
get a different check digit and thus detect the error.
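
The numbering scheme in step 1 can be sketched as follows. The spacing of 125 and the halfway rule come from the description above; the check-digit formula is purely a hypothetical stand-in (a weighted sum taken modulo 10), since the scheme actually used is not stated, and the function names are invented.

```python
SPACING = 125  # gap between consecutive code numbers, as described in step 1


def assign_codes(sorted_names, spacing=SPACING):
    """Give each name, already in alphabetic order, a code number."""
    return {name: (i + 1) * spacing for i, name in enumerate(sorted_names)}


def code_between(lower, upper):
    """Code for a new name falling between two existing neighbors."""
    if upper - lower < 2:
        raise ValueError("no unused number left between these codes")
    return (lower + upper) // 2


def check_digit(number):
    """Hypothetical check digit: weighted digit sum modulo 10."""
    digits = [int(d) for d in str(number)]
    return sum(d * (i + 2) for i, d in enumerate(reversed(digits))) % 10


def with_check(number):
    """Append the check digit to the end of a code number."""
    return number * 10 + check_digit(number)


def is_valid(coded):
    """Recompute the check digit and compare it with the one that was typed."""
    return coded % 10 == check_digit(coded // 10)
```

With 7,000 names the highest code is 7,000 x 125 = 875,000, which still fits in six digits. A simple formula like this one catches many single-digit and transposition errors but not all of them, so it should be read only as an illustration of the principle.
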
2. Prepare Typographic Specifications. This is a much
bigger job than it sounds. Rules must be established for
each detail of type size, style, special characters,
punctuation, indents, capitalization, handling of turnover
lines, column widths, spacings between lines, leader dots,
rules for breaking an address when it will not all go on
one line, etc. The entire input must be divided into
logical fields, and the maximum number of characters in
each field must be specified. Fields were made as small as
possible; examples of fields are the corporate name, street
address, city, state, ZIP code, phone number, and number
of engineers employed. The importance of establishing
specs and field lengths beforehand cannot be emphasized
too much, because changes in computer programming can
be very costly.


The column width was 200 units of 6-point type, which
came out to be 14.1527 picas. Character widths were
specified to thirds of a unit, so each line was made up
of 600 increments. The width of each character in
increments had to be determined and fed into the computer
first.
Instead of justifying right-hand margins of columns by
changing word spaces and hyphenating, the first lines of
an entry were left short when the next field wouldn't fit.
For the last line of each entry, leader dots were specified
to make the phone number or state abbreviation come out
flush right. This illusion of justification does not look too
bad, as can be seen in Fig. 5.
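
The leader-dot calculation for the last line of an entry can be sketched as below. The line length of 600 increments is taken from the text; the character widths, the width of a leader dot, and the function names are hypothetical values chosen only to make the example concrete.

```python
LINE_INCREMENTS = 600  # 200 units x 3 increments per unit, per the text
DOT_WIDTH = 9          # hypothetical width of one leader dot, in increments
DEFAULT_WIDTH = 15     # hypothetical width of any character not in the table


def text_width(text, widths):
    """Total width of a piece of text in increments."""
    return sum(widths.get(ch, DEFAULT_WIDTH) for ch in text)


def leader_fill(left, right, widths, line=LINE_INCREMENTS, dot_width=DOT_WIDTH):
    """Number of leader dots, and leftover increments, between two fields.

    Returns None when the two fields cannot share one line, in which case
    the caller would leave the line short and start a new one.
    """
    free = line - text_width(left, widths) - text_width(right, widths)
    if free < 0:
        return None
    return divmod(free, dot_width)
```

The leftover increments (always fewer than one dot width) would be set as a fixed space so that the right-hand field lines up exactly at the margin.
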
3. Choose a Computer. The decision here went to a
24K Honeywell 200 computer in the McGraw-Hill Data
Processing Center in Hightstown, New Jersey, used with
a Honeywell 500-character-per-second paper tape reader
and a 100-character-per-second paper tape punch.

4. Choose Output Machines. Here the choice was the
American Type Founders B-8 paper-tape-driven
photocomposition machine, which is reliable even though
painfully slow (about as fast as an average typist, at 5
characters per second). The slowness is an asset, however,
because the four machines required to meet the production
schedule provided backup for each other in the event
of trouble. Special 176-character type discs had to be
designed and made to get the variety of type fonts and
sizes required (6-point c & lc, 6-point b-f caps, and
8-point b-f caps).

5. Select Programmers. Programming was contracted
to National Computer Analysts in Princeton, New Jersey.


This firm was chosen chiefly because their programmers
had heavy experience in computer composition.

6. Punch Input Paper Tapes. This input punching
was farmed out to the programming firm to achieve single
responsibility for establishing and punching the necessary
control codes. Here it was found that the hard copy did
not always correspond to the punches made by the two
Dura Mach 10 tape-punching typewriters. Accordingly,
the tapes were fed back into the readers of the Duras to
produce new hard copy for proofreading against the
questionnaires. Error correction required very little
additional keyboarding, because only the manufacturer code,
the field number, and the new corrected wording for the
field had to be punched. The average length of a directory
field is only about three words.

7. Produce Line Printer Proofs. Batches of the
punched tape were fed into the computer along with
the correction tapes for converting to magnetic tape,
merging corrections, sorting records into final sequence,
making validity checks of code numbers, counting
characters to make sure no field was longer than the maximum
length provided for it, and printing proofs. Since the line
printer could print only capital letters, some means of
identifying true capital letters had to be used. The
decision to print an open lozenge under each true capital
letter, as in Fig. 4, worked out very well from the
proofreading standpoint, even though it doubled the line
printer running time.

8. Punch Output Paper Tape. After coded instructions
for bold-facing of advertisers had been punched and
inputted, the computer selected the fields of data needed,
processed these as required for the final sequences in the
two sections of the directory, and added the necessary
control codes for the photocomposition machines. Output
tapes were then punched for the Manufacturers Section
in 21 hours of Honeywell 200 computer time and for the
Product Section in 35 hours, to give a total of about
28 miles of 8-channel paper tape.

Fig. 5. Typography of Manufacturers Section (left) and
Product Section (right) of 1965 Electronics Buyers' Guide,
as produced by running computer-produced paper tapes
through ATF model B-8 photocomposition machines. Computer
is programmed to insert exactly the correct number of
leader dots ahead of phone number and state fields to make
these line up at right, giving effect of justification.


9. Convert Tapes to Repro Positives. The output of 
paper tape was run through three ATF B-8 machines to 
get paper prints in about 20 days of three shift operation. 
(An additional machine was kept in the publisher’s plant 
for making last minute corrections and to serve as backup 
for those in the printer’s plant.) The paper repro prints 
were dummied along with ad proofs, then photographed 
to get page negatives for making offset printing plates. 

Next Edition. The magnetic tape reels containing the 
input data were stored, for two purposes: (a) to produce 
individual questionnaires for acquiring data for the next 
edition, with addresses in correct positions for window 
envelopes; (b) to repeat unchanged material in the next 
edition without additional keyboarding. Customized ques- 
tionnaires make it easy for manufacturers to check what 
they had in the previous directory and mark the changes 
desired. It is estimated that only about 50 pages of new 
input will be keyboarded in 1966 to get 800 pages of 
output. 

Other Directories. Chilton’s Hardware Age directory 
was produced on a computer in 1965, using the facilities 
of Rocappi in Philadelphia. The hardware consisted of 
Dura Mach 10 tape-punching input typewriters, an 
RCA 301 computer, and a Photon 513 output photo- 
composition machine. The end product was 329 pages of 
composition in 6-point type, having the typographic 
format of Fig. 6. 


• Book Catalogs for Libraries 


Computer composition techniques have made it possible 
for many libraries to replace their catalog card files with 
much more convenient book-type catalogs. These are 
usually updated annually, and cumulative supplements 
for new material are issued quarterly. The catalogs can 
serve a number of main libraries as well as all branch 
locations. 

One example is the combination catalog serving the 
medical libraries of Yale, Harvard, and Columbia, pro- 
duced by computer under the direction of F. G. Kilgour 
of Yale. 

Another significant example is being produced by Docu- 
mentation, Inc., for the Baltimore County Public Library. 
There are three hardbound annual volumes, for Title, 
Author, and Subject, containing 1,500, 1,500, and 1,700 
pages respectively to cover 50,000 titles. Press run is 
about 100 sets. Here the cap and lower case printout of 
an IBM 1403 chain printer is reduced photographically 
to 8-point for offset printing to give the format shown 
in Fig. 7. Punched cards are used for input. Processing 
is done on an IBM 1401, as also are sorting, breakout, 
and cumulating of the paper cover quarterly supplements. 


Fig. 6. Typography of Hardware Age 
as produced by a Photon 513 from computer-processed 
information. Bold-faced lines preceded by star indicate 
advertisers. Cross-references for trade names 
are alphabetized in same sequence with 
manufacturers. This directory has only a 
product section. 
Fig. 7. Portion of October 1965 supplement 
book catalog that now replaces catalog card fi 
more County Public Library system, as produce 
1403 line printer having cap and lower case chain 


Other computer-processed book catalogs in 
of the University of Toronto Library and 1 
Atlantic University Library. A number of; 
dustrial libraries are using the same computen 
but using the line printer printouts directly 
because the three or four copies produced in + 
adequate for their needs. 





• Summary 


Computer composition is economical оу 
types of directories, indexes, cumulated bib! 
and other works involving significant amount ’ 
and other manipulation of input information. 

For straight text that requires only justification and 
hyphenation, it is much more difficult to promise cost 
savings at this time. Here also there is a major technological 
problem: the inability to produce at computer 
speeds a printout that will be acceptable to proofreaders, 
editors, and authors in place of conventional printed 
galley proofs. The cathode-ray approach to character 


generation does offer promise here as a proof printer. 
Cost and speed of the machine should approximate that 
of a line printer, but it must be able to give easily readable 
proof prints in a wide variety of type faces, sizes, 
and special characters. 

Another important deterrent to widespread adoption of 
computer composition is the high cost of software. Work 
on basic compiler programs having sufficient flexibility 
to handle different jobs without reprogramming is now 
under way. Costs are high, well up into six figures. If 
these programs can be made sufficiently universal to 
permit amortizing cost over a large number of jobs, the 
number of economical applications for computer composition 
should increase tremendously. 

Input hardware problems are being solved at a satisfactory 
rate with a variety of keyboard units that punch 
paper tape with or without hard copy. Some keyboard 
units even produce magnetic tape, but the higher cost 
per machine may preclude widespread use. 

The output photocomposition hardware picture also 
looks more encouraging in 1966, now that Photon 713's 
and Harris-Intertype Fototronics are in actual use in 
printing plants. These machines sell for about four times 
the price of an ATF B-8, but provide a greater variety of 
type fonts and sizes along with higher operating speeds. 

Cathode-ray photocomposition machines are appearing 
also this year in Europe as well as this country, but it 
remains to be seen whether they can consistently and 
reliably provide the graphic arts quality required by most 
book and magazine publishers. 


Cost Distribution and Analysis in Computer 
Storage and Retrieval¹ 


A method for costing computer jobs done by a 
mechanized storage and retrieval activity is proposed 
and discussed. Attention is confined solely to computer 
costs. The Science Information Exchange, a 
mechanized installation handling information on research 
in progress, is used as the case in point. All 
computer jobs are grouped as batched, singly run or 
maintenance tasks. Job unit costs are calculated with 
and without inclusion of file maintenance costs. Should 
other activities compute their costs similarly, inter-activity 
cost comparisons can be made readily, 
opening the door to cost-quality criteria for mechanized 
searches and report preparation. 


HARVEY MARRON² and MARTIN SNYDERMAN, Jr.³ 


• Introduction 


Despite the widespread interest in the economics of 
computer storage and retrieval of scientific information 
(specifically, the cost of performing searches and preparing 
reports), there is little definitive discussion of this subject 
in the open literature. A few operating installations have 
released data on unit search costs, usually in terms of 
computer time per job obtained by dividing total computer 
processing time for a batch of jobs by the number of jobs. 
By doing so, however, the costs for separate jobs or building, 
modifying, maintaining, and updating the file are 
neglected. Depending on file array, maintenance procedures, 
and the file update frequency, these costs may be 
large and can rarely be disregarded. It can be shown that 
this is especially true if the search files are updated often 
but seldom searched. 

This paper is confined solely to a technique for distributing 
computer costs. This is not to say that other 
associated costs, such as initial file design, programming, 
or general administrative overhead, are unimportant. Quite 
the contrary. They are very important and in the last 
analysis must be brought into the computation for the 
true reflection of costs. However, the distribution of costs 
is complex and relatively uncharted, and in our opinion 
the building block approach to a final solution in this area 
is preferable to an over-all assault on the total problem. 

There are two aspects of computer job costs which 
must be considered: (1) the direct costs of doing a specific 
job and (2) the indirect costs of maintaining (not to be 
confused with the initial building) the files which are used 
in doing these jobs. Both aspects will be discussed in this 
paper. 


¹ Presented at the 1965 Congress of the International Federation of 
Documentation, Washington, D. C., October 15, 1965. 

² Assistant Director, Operations, Science Information Exchange, 
Smithsonian Institution. 

³ Deputy Assistant Director, Operations, Science Information Exchange, 
Smithsonian Institution. 


• System Description 


The Science Information Exchange (SIE) is a clearing- 
house for current research in the life, physical, and social 
sciences. Information within its scope of coverage is col- 
lected, indexed, stored, and made available on demand to 
interested members of the scientific community. Request- 
ors for information have ranged from bench level sci- 
entists to highly placed scientific program managers/ad- 
ministrators. 

Essentially, information on six aspects of each research 
project is acquired, indexed, and stored. These are: 

1. The supporting or funding agency. 

2. The professional investigators. 

3. The name and location of the researching institution. 

4. The project title and a 200-300 word technical de- 
scription of the research. 


5. The period or beginning and end dates. 
6. The annual funding level. 


This information is put into the system on a Notice of 
Research Project (NRP), which is the unit hard copy 
record of SIE. Approximately 70,000 notices of current 
research were received by SIE in 1965. A sample NRP is 
shown in Fig. 1. 
Fig. 1. Notice of Research Project. 


Notices are stored in several files: 

1. Subject Files. A single NRP is filed under each 
subject index point to which the NRP has been indexed. 
This provides a source for rapid reference and/or retrieval 
for requests not requiring use of the computer. 

2. Agency Files. A single NRP is filed under the agency 
supporting research and thus provides an internal reference 
source independent of the computer operation. 

3. NRP Stacks. Multiple copies of each NRP are 
filed alphanumerically by accession number. This file is 
used to supply NRP's in hard copy when they have 
been identified by computer searches of the tape files or 
manual searches of the subject files. It is, per se, never 
used for search purposes. 

Data from each NRP is also placed on appropriate 
magnetic tape files: 

1. The Master File. An alphanumerical listing by 
SIE assigned accession number of a project record 
corresponding to a NRP. For a schematic format description 
see Fig. 2. 

2. Index File. A dictionary which contains codes 
and corresponding captions for supporting agencies, 
locations, and subject index points. It is used for 
validating input codes and attaching captions to [...]. 

3. Contract No.-SIE No. Cross Reference File. Provides 
for manual crossover between agency identification 
and the accession number used by SIE. 

4. Title File. An [...] file of titles for all 
projects on the master file sequenced by SIE assigned 
accession number. It is used solely to attach titles to 
special reports. 

5. Investigator File. Investigator names arranged 
alphabetically. It is used to respond to requests for all the 
projects on which particular investigators are working. 

6. Pending Projects File. A list of proposed or pending 
projects. These records are addressable by accession 
number, investigator, and researching institution but are 
not indexed to subject matter. 


Fig. 2. Schematic magnetic tape record format. Labeled 
fields: accession number (includes supporting agency code); 
principal investigator (PI); location code for PI (state, 
institution, and if needed school and department); location 
codes for OI; subject index codes (a 5 level hierarchical 
structure). 


Except for searches on investigators' names, the master 
file is used for all mechanized searches. Depending upon 
the nature of the request, any or all of the other magnetic 
tape files may be brought into play. Some actual questions 
are shown in Fig. 3. In general, straightforward subject, 
location, or agency searches need use only the master tape 
file because upon identification a list of accession numbers 
for the pertinent records is printed out. The NRP's are 
then pulled from the NRP stacks. If, however, a table or a 
compilation is to be generated in which English captions, 
titles, funds, or contract numbers are required, the other 
files are needed. These tasks almost always require 
programming and/or separate machine runs. These are the 
singly run jobs to be discussed more fully later. 
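
A straightforward subject search of the master file, which yields a list of accession numbers for pulling NRP's, can be sketched as follows. The record layout loosely follows Fig. 2, but the `MasterRecord` class, its field names, and the sample codes are hypothetical; the actual SIE programs ran against magnetic tape and are not reproduced in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    accession_number: str                       # SIE assigned number
    location_codes: list = field(default_factory=list)
    subject_codes: set = field(default_factory=set)


def search_master_file(records, required_subjects):
    """Return accession numbers of records coded to ALL required subjects.

    The records are read once, in file order, as in a single tape pass.
    """
    required = set(required_subjects)
    return [r.accession_number for r in records
            if required <= r.subject_codes]


# Hypothetical master file with three project records.
master = [
    MasterRecord("GN-433", ["24 403"], {"600 15", "610 87"}),
    MasterRecord("GN-434", ["24 720"], {"600 15"}),
    MasterRecord("PH-101", ["17 995"], {"610 87", "758 55"}),
]

# The printed list of accession numbers is what would be used to pull
# the corresponding NRP's from the stacks.
hits = search_master_file(master, ["600 15", "610 87"])
print(hits)
```

Requests that need captions, titles, or funds would have to consult the other files as well, which this sketch does not attempt.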


SAMPLE QUESTIONS 


Typical Subject Requests 


1. Oxidations and autoxidation of long chain fatty acids. 
   (SIE found 54 projects representing 19 different 
   sources of support.) 
2. Image covariance factor analysis. 
   (SIE found 13 projects representing 7 different 
   sources of support.) 
3. Psychological stress in heart disease. 
   (SIE found 36 projects representing 11 different 
   sources of support.) 
4. Electrochemistry of iron porphin complexes, e.g., 
   electron transfer rates, equilibrium, and formation 
   rate constants. 
   (SIE found 23 projects representing 8 different 
   sources of support.) 

Typical Administrative Requests 

1. A tabulation of all federal grants (or contracts) to 
   select universities showing the total number of 
   grants and funds to each department within the 
   universities, prorated by Supporting Agency. Include 
   subtotals for each graduate school. 

2. An alphabetical listing of 1,400 neurosurgeons showing 
   the research projects in which they are currently 
   participating, reflecting the title, dates and funds for each 
   project. 

3. An inventory of all current Public Health Service 
   grants to Schools of Pharmacy. 

4. An inventory of all U. S. supported research in Latin 
   American countries arranged alphabetically by 
   investigator within country, listing titles, dates, funds, 
   supporting agencies and location of the project. 


Typical Subject-Administrative Requests 


1. Total funds spent on current research in the 
   cardiovascular field in New York State. 

2. A list of all current projects dealing with health 
   related research in foreign countries which identifies 
   the source of support. 

3. All current studies on leukemia supported by Federal 
   Government excluding USDA and NIH. 


Fig. 3. 
• Computer Array 


Until May 1965, SIE had a 16K IBM 1401 computer. 
The hardware presently used is a 6 tape, 16K, IBM 1460 
computer with a card-read punch, a 600 lines/minute 
printer, and a console typewriter attached.* The tape 
units are 729 model 5's operating at 800 bits/inch with a 
transfer rate of 60,000 characters/second. There is no disc 
capability. 


• Job Categorization 


With very few exceptions, the tasks performed by the 
computer activity in an information center can be grouped 
for costing purposes into three types of jobs: batched, 
singly run, and file maintenance. Following is a discussion 
of how this grouping is performed at the Science Information 
Exchange. 

1. Batched Jobs. These include all tasks which are 
grouped so that they can be performed concurrently during 
a single pass of the master files. Obviously, for economy 
and faster turnaround times, jobs are batched whenever 
possible. At SIE, almost all batched jobs involve 
multiple selection criteria and the superimposition of several 
boolean statements (matches). They are perhaps 
equivalent to conventional bibliographic compilations 
involving selection and organization by subjects, authors, or 
other search parameters. Fig. 4 shows a sample of an 
actual question asked of SIE and the corresponding 
"Request for a Computer Run" which was completed by a 
scientific analyst. It is a typical "batchable" question 
which involves four subject search terms (or parameters) 
and two matches (part B of the request). 

Presently, programming and machine configuration 
limit the batch size to 150 search terms (i.e., subjects, 
locations, or supporting agencies). This may be one question 
with 150 parameters or 150 questions with one parameter 
or combinations thereof. 

The cost of the individual job is computed by distributing 
the total batch processing time among the jobs in the 
batch in proportion to the number of subject terms each 
contains. The time is then multiplied by the hourly computer 
cost. It is, of course, recognized that this picture is 
oversimplified because there are other factors which affect 
total computer time per job. Analysis thus far, however, 
indicates that the number of search terms appears to be 
the dominant factor influencing job times. 
2. Singly Run Jobs. This covers all tasks which use the 
files but which occupy the total data processing activity 
while being run. These jobs vary from very simple to 
extremely complex. Singly run jobs are performed as such 
only because either programming or machine limitations 
preclude batching. In such cases where part of a job is 
run in a batch and then selected material is further 
machined in order to arrange the information as required, it 
is counted as a singly run job with the batched time added 
to the time logged while it was being handled as a singly 
run job. Compilations involving several types of information 
(e.g., subjects, locations, and funds) almost always 
require special formatting and are therefore handled as 
singly run jobs. Jobs in which textual material or 
titles as well as subject, investigator, and location indexes 
are involved also must be handled as separate jobs and 
come within this category. 


* In spring of 1966 this equipment will be replaced by IBM 200/86 LU 
equipment. 


3. Maintenance, Update and Research Tasks. These are 
the computer runs which contribute to the quality and 
currency of the search files and the retrieval programs. 
Included are update runs and error correction passes on 
all files in current use as well as research and development 
of new file arrays or search programs. At SIE, the master 
file and all satellite files are updated every two weeks. 
This in itself is a major computer time commitment but 
is considered worth doing because stress is placed upon 
having an up-to-date master file which is as free from 
errors as is reasonably possible. Also, new and different 
demands require constant experimentation with new files 
and redesign of old ones for quicker, more a[...] 
responses. 
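
The batched-job rule described under item 1 (total batch time distributed among the jobs in proportion to their search terms, then multiplied by the hourly computer cost) can be sketched as below. The function name, the job labels, and all figures are hypothetical, and the sketch keeps the simplification acknowledged in the text that the number of search terms is the only factor.

```python
def batch_job_costs(batch_hours, hourly_cost, terms_per_job):
    """Distribute a batch's computer time over its jobs by search-term count.

    terms_per_job maps a job label to the number of search terms that job
    contributed to the batch. Returns a dict of job label -> dollar cost.
    """
    total_terms = sum(terms_per_job.values())
    if total_terms == 0:
        raise ValueError("batch contains no search terms")
    return {
        job: batch_hours * terms / total_terms * hourly_cost
        for job, terms in terms_per_job.items()
    }


# Hypothetical batch: 1.5 computer hours at $40 per hour, three jobs.
costs = batch_job_costs(1.5, 40.0, {"job A": 4, "job B": 10, "job C": 6})
for job, cost in costs.items():
    print(f"{job}: ${cost:.2f}")
```

Because the shares are proportional, the job costs always add up to the cost of the whole batch.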


• Computation of Unit Costs 


If during a given period of time the total number of 
computer hours $T$ is to be accounted for by the hours 
spent on file maintenance and research $M$, batched jobs $B$, 
and singly run jobs $S$, then obviously 

\[ T = M + B + S \quad \text{(hours)} \]

If $D$ is the total dollar cost of the computer installation 
for the period, then 

\[ d = \frac{D}{T} \quad \text{(average hourly cost, dollars per computer hour)} \]

and $C_M = Md$, $C_B = Bd$, and $C_S = Sd$ are the costs of 
running maintenance and research tasks, batched jobs, and 
singly run jobs respectively. 

If during this time period there are $n_b$ batched jobs 
and $n_s$ singly run jobs, then 

\[ c_b = \frac{C_B}{n_b} \quad \text{and} \quad c_s = \frac{C_S}{n_s} \quad \text{(dollars per job)} \]

where $c_b$ and $c_s$ are unit costs for batched and singly run 
jobs respectively. 


It should bé noted that if the machine ів under-utilized 
in this period i in the sense of paid for time while the ma- 


' chine is idle, this is factored into the computation via D 
ала thence to d, which represents the over-all cost. The 


burden 18 thus distributed over-all the activities. 
However, it is neither fair nor realistic to consider су 


‘and e, as inclusive unit costs. File maintenance, update’ 


and research tasks are support activities which are done 
in order that batched and singly run tasks be performed 


efficiently on a current and accurate file. Therefore, it n 


seems reasonable to lay off the file maintenance and search 
program costs against those jobs which use these files. 


` Further, this lay off should be in proportion to the use of: 
the files in which the maintenance investment is being 


made. 
Accordingly Cy can be separated into PR parts: 


Сы- (sis) быз (sis) Сы 


Тће adjusted unit cost for batched and singly’ run jobs ~ 


[Fig. 4 reproduces a completed "Request for Computer Run" form.
Entries: Job # 1199; Cost Code S; Priority: Firm, due 10/10;
output requested: Listing & Pull; date span: FY 1963 and later.
Printed notes on the form: "Flexible jobs ordinarily are handled
on a first come, first serve basis." and "In order to insure that
there is machine and programming time available, review requested
date with Chief, OD before committing SIE to a delivery schedule."
Special instructions entered: list all records in 1; list all
records coded to 2 and 3; list all records coded to 2 and 4.]

Fig. 4. The Request for Computer Run for the question, “All research records dealing with pleuropneumonia-like organisms
and acetate metabolism of tissue cells.” It was prepared by an SIE scientific analyst. The “listing” is a computer-generated list
of the pertinent NRP accession numbers, and the “pull” is the stack of NRP's. The latter will be forwarded, after screening, to
the questioner and the former retained for SIE record purposes. Other special requirements such as time spans or scope





American Documentation, April 1966    93




which include a prorated share of file maintenance costs
become:

$$c_B = \frac{C_B + \left(\frac{B}{B+S}\right) C_M}{n_B}$$

$$c_S = \frac{C_S + \left(\frac{S}{B+S}\right) C_M}{n_S}$$


Upon examination of these formulas, it can be seen
that unit costs are functions of several independent
variables: total monthly computer usage (which determines
d), the ratio of the numbers of hours spent on maintenance,
batched, and singly run jobs, and the number and type of
batched and singly run jobs. Obviously, the computation
of unit costs under the best of circumstances is no easy
matter. Indeed, prediction of unit job costs for short
periods of time with a mix of jobs is necessarily inexact.
Also, the formulas show clearly what is intuitively
expected: as the ratio of maintenance hours to search
hours increases and the number of jobs goes down, the unit
job costs increase.
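The unit-cost formulas can be traced with a short Python sketch. It assumes the maintenance cost is laid off in proportion to hours, B/(B+S) and S/(B+S), and the function name and argument names are illustrative only. The sample figures are the first-half-1965 totals as they can be read from Fig. 5, so they are subject to any misreading of that table.

```python
def unit_costs(D, M, B, S, n_B, n_S):
    """Return (hourly cost, direct unit costs, adjusted unit costs).

    D: total dollar cost; M, B, S: hours of maintenance, batched,
    and singly run work; n_B, n_S: numbers of batched and singly run jobs.
    """
    T = M + B + S                      # total hours in the period
    d = D / T                          # average hourly cost
    C_M, C_B, C_S = M * d, B * d, S * d
    direct = (C_B / n_B, C_S / n_S)    # no maintenance included
    # Lay off maintenance cost in proportion to hours of file use.
    adj_B = (C_B + C_M * B / (B + S)) / n_B
    adj_S = (C_S + C_M * S / (B + S)) / n_S
    return d, direct, (adj_B, adj_S)


# Totals for the first half of 1965, as read from Fig. 5 (assumed correct).
d, direct, adjusted = unit_costs(D=76_500, M=466, B=631, S=631,
                                 n_B=932, n_S=200)
print(round(d, 2), [round(c, 2) for c in direct],
      [round(c, 2) for c in adjusted])
```

Under these assumptions the results should land close to the hourly rate and unit costs the article reports for that period; small differences are expected from rounding in the printed table.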


• Operating Experience


Fig. 5 shows the actual operating experience of the 
SIE for the full year 1964 and the first half of 1965. 


Considering the first portion of the table dealing with


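[Fig. 5, left portion. Column headings: Computer Use and Cost
(Hrs., Cost, $/Hr.); Maint. (Hrs.); DIRECT COMPUTER COSTS
(No Maintenance Costs), with Batched Jobs (Hrs., Jobs, $/Job)
and Singly Run Jobs (Hrs., Jobs, $/Job). Month labels and most
per-job entries are not legible; the legible monthly entries
follow in the order printed.]

Hrs.   Cost ($)   $/Hr.   Maint. Hrs.
221    11,700     53      55
211    11,600     55      61
229    11,900     52      71
319    13,200     42      72
307    13,000     43      58
285    12,300     43      66
297    12,400     42      38
289    12,350     43      67
279    12,300     44      61
       12,350
       12,750

289    12,450     43      97
354    13,150     37      79
324    12,800     40      92
243    12,850     53      78
254    13,100     52      48    (batched: 67 hrs., 134 jobs, $26/job)

Totals, first half of 1965: cost $76,500; $44/hr.; maintenance
466 hrs.; batched jobs 631 hrs., 932 jobs, $30/job; singly run
jobs 631 hrs., 200 jobs, $139/job. An isolated entry of 136 hrs.
and 33 jobs also appears under Singly Run Jobs.

Fig. 5. SIE experience.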


Direct Computer Costs (i.e., no maintenance costs are
included), several aspects are worthy of note:


1. During the year 1964 the times occupied by maintenance,
batched, and singly run jobs were 23%, 30.7%,
and 46.3% respectively. The corresponding ratios during
the first half of 1965 were 27%, 36.5%, and 36.5%. The
trend in 1965 was towards a greater proportion of time
spent on batched jobs and less on singly run jobs. The
computer time spent on file maintenance tasks during
1964 and the first half of 1965 was roughly the same,
about one quarter of the total computer load.

2. The average hourly computer cost was $45 in 1964
and $44 during the first half of 1965. The monthly
variation, however, was considerable, going from a low of $37
to a high of $55. Obviously, greater computer use lowers
the hourly rate. In May 1965, the IBM 1401 central
processing unit was replaced with an IBM 1460 which
rented for slightly more than the 1401 but processed all
of the jobs slightly faster. The average hourly computer
rate went up, but the unit batched job costs for May and
June remained about the same. Apparently, the higher
hourly rate was offset by the decrease in processing time.
3. In 1964, the average direct cost was $36 for batched
jobs and $131 for singly run jobs. In the first half of 1965
the costs were $30 and $139. The authors feel the decrease
in unit costs for batched jobs in the first part of 1965 is
due primarily to better overall computer management
and refinement of batched search programs. It is doubtful
that these costs can be reduced much further
without major system modifications.


A recomputation of unit costs with file maintenance
cost laid off in proportion to the use of the file is shown
on the right side of Fig. 5 under Total Computer Costs.
The batched job unit costs jump from $36 to $48 for


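[Fig. 5, right portion: TOTAL COMPUTER COSTS (Maintenance Costs
Included). Column headings: Batched Jobs (Hrs., Jobs, $/Job);
Singly Run Jobs (Hrs., Jobs, $/Job). Only part of this portion
is legible. First row: batched 126 hrs., 138 jobs, $38/job;
singly run 175 hrs., 33 jobs, $220/job. An isolated entry of
$173 appears in the singly run $/Job column. Totals, first half
of 1965: batched 871 hrs., 932 jobs; singly run 857 hrs.,
200 jobs, $190/job.]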











...information is displayed and the weight applied to each
index term in retrieving documents will obviously influence
the facility of communication, and consequently the
utility of the index. However, even if these two factors
were optimum, the utility of the index is still affected by
the precision of the meaning of the indexing terms.
Clearly a system with perfect display and optimum
weighting factors is useless to someone who does not
understand the meaning of the words that it uses.

Most information theorists seem to assume that mean- 
ing is perfectly precise; that is, they assume that con- 
cepts are lucidly embodied in documents, and that these 
concepts can be perceived and labeled with precision. To 
validate this assumption, they define or appoint a subject
expert, someone whose duty it is to provide the standard
indexing information. Invoking the subject expert is
logically equivalent to the assumption that concepts in 
documents can be precisely labeled, at least by someone. 

The concept of subject expert is unsatisfactory, both in 
practice and in theory. In practice, the available experts 
wish to restrict the area of their expertise to so narrow a
speciality as to be nearly useless. Furthermore, no two 
subject experts agree, and neither is altogether compre- 
hensible to the questioner. The indexer usually resorts to 
using himself as the subject expert, and to doing his best 
to explain his point of view to the questioner. Sometimes 
the point of view of the indexer is so foreign to the
questioner that the latter will not attempt to use the index.²

As a practical expedient, the subject expert is not a 
useful addition to an information system. As a theoretical 
expedient, an average or abstraction of subject experts is 
not helpful either. We can do no better than assert that 
a subject expert is one who fulfills the assumption given 
above. Enough is known about the process of perception 
to make it most unlikely that a real person can fulfill that 
assumption. 

In this paper, we adopt the point of view that a subject 
expert is neither necessary nor desirable, either as a part 
of an information system or as a standard of reference. 
We try to answer two questions: How precisely do ordi- 
nary people skilled in the subject perceive and label the 
concepts embodied in documents? Is this degree of
precision such as to pose a problem so severe as to
make optimum weighting factors for index terms a minor
improvement? In subsequent papers, I plan to investigate
the improvement of communication between questioner
and answerer as it is influenced by the content and
layout of the index itself, always bearing in mind the
precision with which the meaning of the indexing terms
is understood by the questioners.

Hillman (2, 3) has shown that four factors enter into
the definition of relevance: the question, the answer, a
degree, and a corpus. The term “relevance” is ordinarily
applied to define the relationship between two things, for
instance, a question and answer. Many think that relevance
either exists or does not exist. However, in many
instances we must also admit to degrees of relevance. Out
of the many possible answers to a question it is possible to
say that some have a great deal of relevance, some less
relevance, and others have little or no relevance. Hillman,
in his paper, writes in terms of “concepts,” suggesting
that their interpretation depends on the field of knowledge
(corpus) to which they are being applied. Hence, the
field must be specified as a part of our definition of the
relevance existing between two things. I should like to
suggest that this expanded definition of relevance should
also be applied to our consideration of words and their
meanings. This is an intuitively agreeable course of action,
since we are all aware of how a given word may have a
number of meanings and of how the specific meaning
applied to the word depends on the field of knowledge in
which it is being used. I propose that meaning can be
defined as the relevance of a word to the concept that it
labels.³ Therefore, if we are to specify the meaning of a
word, we must also specify the four parameters suggested
by Hillman's work: the word, the abstraction for
which it is to stand, the degree of relevance that we wish
to exist between the word and the abstraction, and the
field of knowledge in which this word will be used.

² This is the situation of many organic chemists in regard to
alphabetical indexes of chemical names.

How then can these ideas aid us to better understand 
the task of indexing? By assigning a descriptor⁴ to a
document, the indexer asserts that the descriptor has a 
high degree of relevance to the contents of the document; 
that is, he asserts that the meaning of the descriptor is 
strongly associated with a concept embodied in the docu- 
ment, and that it is appropriate for the subject area of 
the document. Let us assume that the indexers assign the 
descriptors in the order of the degree of relevance to the 
concepts, or that they assign all of the descriptors that 
they believe have a high degree of relevance. Then the 
consistency with which a given degree of relevance is 
associated with a given descriptor-concept pair will reflect 
the precision of the association strengths. Hence,
consistency of indexing serves as a measure of the precision
of meaning.⁵

Through measuring the consistency with which a term 
is applied to a concept, we are able to assess whether or 
not its meaning is understood with precision. By having 
a number of abstracts indexed by a number of people, it
is possible to discover the consistency with which a given 
indexing term was used and, hence, how well the meaning 
of the term was understood. 


³ Harold Wooster suggests that meaning be defined as the degree of
synonymy connecting a word and the concept it labels. However, I
think it is clearer to use synonymy to name a relation between like
things, e.g., one word has a degree of synonymy with another; and to
use relevance to name a relation between unlike things, e.g., a word
has a degree of relevance to a concept.

⁴ In this paper, descriptor is used as a synonym for index term in
accordance with popular usage, but at odds with its formal definition.

⁵ For another approach to the measurement of the precision of
meaning, see references 4 and 6.
• Part I. Indexing Consistency with Free Choice
of Descriptors


In Part I of the experiment, 15 indexers were asked to
choose descriptors for 50 abstracts which had been chosen
at random from a single abstract journal. The indexers
were to select any words or phrases that they considered
appropriate. They were not supplied with instructions for
making the choice, a word list, or the definition. This
resulted in a list of descriptors that contained 1,050
different words and phrases that had been applied to the 50
abstracts. Plural and singular forms of the same word
were counted as a single descriptor. The diversity of the
responses made it impossible to use these data to arrive
at any estimate of the precision with which the
descriptors were used. However, several interesting conclusions
could be drawn from these data. An average of 3.6
descriptors was assigned by each indexer to each abstract.
However, let us remember that the depth of indexing is
not the same as the number of index terms applied. Depth
of indexing is measured by that proportion of concepts
embodied in the document to which index terms are
applied. If a document is about two concepts only, the
deepest indexing can apply only two terms to it.
Conversely, if a UDC number six digits long is applied to a
document that embodies a dozen concepts, only one index
term is applied (not the six terms implicit in the six
digits) and the indexing is shallow. To determine the
absolute depth of indexing for the documents used here,
an absolute judgment of the number of concepts
embodied in each document would be needed. Since we have
avoided absolute judgment, we cannot make this
calculation. To estimate this number of concepts by taking the
consensus of the indexers, we would need the opinion of each
indexer regarding the generic relations among the terms
he applied, that is, the degree to which each term implies or
includes every other term. These opinions, rather difficult
to form, were not determined. Consequently, we cannot
estimate the depth of indexing and can only guess that it
probably is not very deep.

Of the 1,050 descriptors chosen, only 48% were
actual words or phrases that appeared in the abstract or
the title. The indexers apparently found the language
used by the author in the abstract inadequate for
describing the work performed. As might be expected, a
greater number of descriptors were applied to longer
abstracts.


CHANCE OF RETRIEVAL


The descriptors assigned by the indexers to the subjects
covered in an abstract can be considered to be the search
program they would first formulate if they were attempting
to find information on those subjects. The probability
that the search devised by an average indexer would
contain at least one descriptor in common with those
chosen for a given abstract by all other indexers
participating is given in Equation 1:

$$C_j\,(\%) = \frac{\displaystyle\sum_{i=1}^{m} X_{ij}^{2} \quad (2 \le X_{ij} \le n)}{\displaystyle n \sum_{i=1}^{m} X_{ij} \quad (1 \le X_{ij} \le n)} \times 100$$

Equation 1. Chance of Retrieval for Abstract j, $C_j$.

Here $C_j$ is the chance of retrieval for abstract j, expressed
in per cent; $X_{ij}$ is the number of times the descriptor i
was applied to the abstract j; n is the number of indexers;
and m is the number of descriptors. This new measure of the
chance for retrieval is necessary because existing measures of
efficiency include a term which requires a judgment of the
pertinence of the descriptor choice. No such judgment
was made in dealing with these data.
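As a hedged illustration, the chance of retrieval can be computed as below. The sketch assumes that Equation 1 has the form C_j = 100 · Σ X_ij² / (n · Σ X_ij), with the numerator restricted to descriptors applied by at least two indexers; that reading is chosen because it gives 0% when every indexer uses different descriptors and 100% when all use the same ones. The function name and the sample counts are hypothetical.

```python
def chance_of_retrieval(counts, n_indexers):
    """Chance of retrieval (per cent) for one abstract.

    counts: for each descriptor applied to the abstract, the number of
    indexers who applied it (the X_ij values for a fixed j).
    Assumed form: 100 * sum(X^2 for X >= 2) / (n * sum(X)).
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    # Only descriptors shared by two or more indexers contribute.
    shared = sum(x * x for x in counts if x >= 2)
    return 100.0 * shared / (n_indexers * total)


# Hypothetical abstract indexed by 15 people: one descriptor used by 8,
# one by 3, two by 2, and 29 descriptors used once each.
counts = [8, 3, 2, 2] + [1] * 29
print(round(chance_of_retrieval(counts, 15), 1))

all_different = chance_of_retrieval([1] * 40, 15)
all_same = chance_of_retrieval([15, 15, 15], 15)
```

With counts of this shape the result falls in the low per-cent range, which is the order of magnitude reported for free choice of descriptors.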

If each indexer applies different descriptors to an
abstract, the chance of retrieval is zero; if each indexer
applies the same descriptors, the chance of retrieval is
100%. Nearly 62% of the descriptors were used only
once; consequently, the chance of retrieval was low,
varying from 1.5% to 12%, averaging 6.5%. This I consider
to be a good estimate of the success of any initial search
if it is made by a searcher familiar with, but not
knowledgeable in, the special field of the search and if the
searcher is not given indexing aids or the advice of
subject experts. The chance of retrieval showed no
correlation with the number of words in the title and in the
text of the abstract. Of the approximately 1,050
descriptors used, 1%, or 10 words, accounted for 30% of
the descriptor-abstract pairs and 2.2% of the descriptors
accounted for half of the pairs.
This suggests that a much smaller number of descriptors
would be almost as effective for describing these
abstracts as the 1,050 actually used. The value of the chance
of retrieval for each abstract was plotted versus the
number of descriptors chosen for the abstract. The
least-squares straight line showed a negative correlation,
significant at the 99% level. I conclude that, under the condition
of free choice of descriptors, the greater the number of
descriptors applied, the more difficult is the retrieval.
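The fit described here can be sketched in a few lines of Python. The data points are invented for illustration, and the Pearson coefficient is used as a stand-in, since the text does not name the statistic behind its significance test.

```python
from math import sqrt


def least_squares(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)   # Pearson correlation coefficient
    return slope, intercept, r


# Hypothetical abstracts: number of distinct descriptors chosen for each,
# and the chance of retrieval (per cent) computed for each.
n_descriptors = [20, 30, 40, 50, 60]
chance = [12.0, 9.0, 7.0, 4.0, 1.5]

slope, intercept, r = least_squares(n_descriptors, chance)
print(round(slope, 3), round(intercept, 2), round(r, 3))
```

A negative slope and a coefficient near -1 on data like these correspond to the pattern reported: the more descriptors applied under free choice, the lower the chance of retrieval.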
Some control of the indexing language is obviously the 
next step in our attempt to eliminate all the variables and 
measure only the precision of the meaning of the words 
(the precision with which indexers and questioners
knowledgeable in the subject can perceive and label
concepts embodied in documents). One possible mode of
control would be to restrict the indexing language to that
appearing in the documents. There are several objections
to this procedure. First, it is not drastic enough. When one 


attempts to eliminate an undesired variable from a mea- 
surement, the best initial strategy is to give the undesired 
variable two extreme values. In this case, wishing to 
measure the precision of meaning independently of choice 
of language, we first give great freedom, then highly re- 
strict choice of language. Free to use the language of the 
documents, the indexers can still use nearly half of the 
descriptors. As we shall see, a cut to 10% is not too
drastic. 

A second objection to the procedure of restricting the
language of description to that of the documents arises
from the particular document set used. Although many
of the abstracts were written by the original authors,
many were not. The language in the latter is not that of
the author. Thus, we cannot restrict the indexer to the
language used by the author. Furthermore, what we wish
to estimate is the precision with which the indexer
understands the meaning of the words, not the facility with
which he picks out the author's words. In forcing the
indexer to use any language but his own, we introduce
an additional uncontrolled variable into our measurement.
Finally, the author's language is not an optimum
choice to reach conditions of maximum precision in
meaning, because there is no reason to believe that authors use
words more precisely than indexers do.

It seems best to begin with the list of freely chosen 
descriptors, for these represent the natural expression of 
the group of indexers, if not of each individual indexer. 
In restricting the list, we can make no "intelligent" choice, 
for to do so is to reintroduce the subject expert, whose
influence we wish to avoid. The list should, then, be 
drastically reduced in size in an arbitrary way. The list 
was cut to one-tenth, using a random selection so ar- 
ranged that terms used by many indexers are more likely 
to be retained. Chosen in this way, the final list resembles 
more closely the list used by each individual indexer than 
would be the case if the list were culled at random with- 
out consideration of the frequency of use of the terms. 

To examine the effect of reducing the number of
descriptors on the chances of retrieval, a small set of
descriptors was chosen from the total list of 1,050.
Descriptors were chosen for this set by first giving a weight to
each descriptor — a weight representing, not its probable 
utility in retrieval, but the frequency with which it was 
applied by the indexers. Numbers from 1 to 2,180 were 
assigned to each descriptor each time it was applied, so 
that a given descriptor that was applied to 30 abstracts 
might receive the numbers 1-30 or 70-99, whereas a 
descriptor that was applied only once would be given 
only a single number. Numbers were then taken from a 
random number table (6) and matched against the num- 
bers assigned to the descriptors until a set of 100
descriptors was chosen. This set of descriptors was then com-
pared with our original descriptor-abstract data to find 
how many times each descriptor in the set of 100 had been 
applied to the 50 abstracts by the original 15 indexers. 
We then substituted this new frequency data in Equation 
1 to find how restricting the number of descriptors had 


affected the chance of retrieval of each abstract. This 
chance of retrieval is the probability that the first search 
program would succeed if only the descriptors in the 
selected list of 100 were used. This chance varied from 
0 to 100%, averaging 36%. When the selected vocabulary 
was used to calculate the chance of retrieval, the correla- 
tion with the number of descriptors applied rose to a 
small positive value, indicating no correlation. That is, 
the disadvantage of assigning many descriptors to a 
document, without control of choice, was overcome by 
restricting the choice of descriptors used in searching 
(Table 1). 
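The frequency-weighted selection can be sketched as follows. Python's pseudo-random generator stands in for the printed random number table, and the descriptor counts are hypothetical; only the procedure (one number per application, draws repeated until the set is full) follows the text.

```python
import random


def select_descriptors(applications, k, seed=0):
    """Pick k distinct descriptors, weighted by how often each was applied.

    applications: dict mapping descriptor -> number of times it was applied.
    Each application receives its own ticket, so a descriptor applied 30
    times holds 30 tickets and one applied once holds a single ticket.
    """
    tickets = [d for d, count in applications.items() for _ in range(count)]
    if k > len(set(tickets)):
        raise ValueError("not enough applied descriptors to fill the set")
    rng = random.Random(seed)   # stand-in for a random number table
    chosen, seen = [], set()
    while len(chosen) < k:
        d = tickets[rng.randrange(len(tickets))]
        if d not in seen:       # a repeated draw adds nothing new
            seen.add(d)
            chosen.append(d)
    return chosen


# Hypothetical application counts for a small vocabulary.
applications = {"lead monoxide": 8, "data processing": 5,
                "conductivity": 3, "phase modulation": 1, "optics": 1}
chosen = select_descriptors(applications, k=3)
print(chosen)
```

Because frequently applied descriptors hold more tickets, they are more likely to survive the cut, which is the property the text relies on when it says the reduced list resembles each indexer's own vocabulary.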


• Part II. Indexing Consistency with Restricted 
Choice of Descriptors 


In Part II of the experiment, the same 50 abstracts, 
rearranged in a random fashion, were sent to 9 indexers, 
all of whom were among the original 15 indexers. These 
indexers were now supplied with the randomly chosen 
list of 100 descriptors and requested to apply these to the 
abstracts. For several of the abstracts, no adequate de- 
scriptors appeared in the list; consequently, the subjects 
were asked whether or not they considered the descriptors 
they had chosen for each abstract to constitute an ade- 
quate description. This opinion showed no correlation 
with the average number of descriptors chosen, and none 


with the chance of retrieval. 


Restricting the choice of descriptors results in a marked 
increase in the consistency (7, 8, 9) with which they are
applied. In Part I, no descriptor was applied to any 
abstract by all of the indexers, whereas in Part II, 19 of 
the descriptors were applied by all of the indexers. How- 
ever, of these 19 only 6 were applied with perfect con- 
sistency; that is, they were either applied 9 times or not 
at all. The other 13 descriptors were applied consistently 
to certain abstracts by all indexers but inconsistently to 
other abstracts. Of the 100 descriptors, 15 describe
concepts that were unknown only a few years ago. Five of
these new words, or 33% of them, were among the 19
most precise descriptors, while only 16% of the older
words were used precisely. It would be interesting to
know if new concepts are understood more precisely than
older ones as suggested by these data.

Twelve of the 100 descriptors on the list were used in
34% of the descriptor-abstract pairs. Of these often-used
descriptors, only 1 (or 7%) was new. The other 11
descriptors, or 13% of the total, were older words. This
suggests that descriptors for older concepts tend to be
used more frequently.

The chance of retrieval was calculated for each abstract
by means of Equation 1. It varied from 9.8% to 56%,
averaging 22%, a substantial improvement over Part I.
The correlation of the chance of retrieval with the number
of descriptors assigned to the abstract was again strong
and negative. Smaller sets of descriptors were chosen from
the list of 100 by using the weighting method described
in Part I. Sets of 50, 25, 12, and 6 descriptors were
chosen. Each set was then used to calculate the chance
of retrieval for each abstract with the results shown in
Table 1, and graphically in Fig. 1. Again, restricting the
number of descriptors excludes the possibility of retrieving
some of the abstracts, i.e., the chance of retrieval is
0, but it increases the average chance of retrieval for the
sets of 50 and 25 descriptors (dashed curve). However,
with the sets of 12 and 6 descriptors, more and more
abstracts are irretrievable and the average chance of
retrieval diminishes. The correlation between chance of
retrieval for a particular abstract and the number of
descriptors applied to the abstract is strongly negative
for the largest three sets of descriptors (100, 50, 25) and
insignificant for the smallest two sets.

Table 1. Relation of the number of descriptors used for
retrieval to retrieval chance and to the correlation of
retrieval chance with the number of descriptors assigned to
an abstract.

           Number of          Chance of Retrieval, %     Correlation
           Descriptors Used   Min.    Max.    Aver.      Z Value   Significant
Part I     1000               1.5     12      6.5        -0.51     Yes
           100                0       100     36         +0.04     No
Part II    100                9.8     56      22         -0.78     Yes
           50                 0       67      33         -0.52     Yes
           25                 0       100     48         -0.35     Yes
           12                 0       100     45         -0.13     No
           6                  0       100     23         +0.27     No
Part III   45                 8.4     15      12         +0.03     No
           22                 14      18      16         +0.47     No
           11                 19      91      37         +0.27     No
           6                  37      44      41         -0.19     No


• Part III. Indexing Consistency with Many Indexers


For the third part of the experiment 21 abstracts were
selected. Each abstract was one about which there had
been substantial disagreement among the indexers as to


whether descriptors assigned from the list of 100
adequately described the abstract or not. From these 21, the
abstracts were chosen that required the use of “new”
descriptors; i.e., terms that have been added to the
technical vocabulary during the last few years. This
eliminated all but 8, and from these 5 were selected at random.
The descriptor set made up for these abstracts consisted
of all the descriptors that had been assigned to these
abstracts by indexers in Part II. This required 28
descriptors. Seventeen additional descriptors were then
chosen from the Part II set at random by using the
weighting procedure already described to compile a list
of 45 descriptors. The abstracts and descriptors were sent
to several hundred indexers with backgrounds similar to
those of the indexers used in the first two parts of this
experiment. Each indexer was asked to assign descriptors
from the list of 45 to each of the 5 abstracts and to
indicate whether or not they considered the descriptors
chosen adequately described the abstracts.
Three hundred and twenty-three indexers responded.
There were several differences between the results of
Part II and those of Part III and the conclusions that
could be drawn from them. The chance of retrieval in
Part III was lower than that in Part II; the larger
number of indexers had a greater difference of opinion,
lowering the chance of retrieval. The correlation of chance of
retrieval with the number of descriptors assigned per
abstract is insignificant, probably a consequence of the
drastic reduction in the number of abstracts. The 12
descriptors most often used accounted for 59.5% of the
total number, or nearly twice the fraction of Part II. Of
these, 6 were new and 4 showed good precision. The
difference in usage of new words and the tendency to use
new words less precisely than old words fail to appear
in Part III, probably because the particular abstracts
were chosen to include frequent use of new words.

No descriptor was used in Part III with perfect
precision. The best four descriptors (lead monoxide,
modulation transfer function, phase modulation, nuclear
reactors) were used by 96%, 93%, 88%, and 85% of the
indexers, respectively, for one of the 5 abstracts, and
none of the indexers applied any of these terms to more
than a single abstract.

The typical graph shown in Fig. 2 shows how “lead
monoxide” was handled in each of the three parts of this
experiment. As we have already noted, the descriptor was
applied in Part III to one of the 5 abstracts by 96% of
the indexers and was not applied to any of the other 4
abstracts by any of the 323 indexers. This is shown by
the graph labeled “Part III” in Fig. 2. In Part II, in
which 9 indexers were asked to index 50 abstracts using
a list of 100 terms, “lead monoxide” was applied to one
abstract by 67% of the indexers and to a second by 78%.
None of the indexers applied this term to any of the
other 48 abstracts. In Part I, in which 15 indexers were
asked to index the same 50 abstracts, but without the aid
of a descriptor list, the term was applied by 53% of the




















[Fig. 2 graphs. Panels: Part I, Part II, Part III. Vertical
axis: indexers applying the descriptor; horizontal axis: rank.]

Fig. 2. Graphs illustrating the precision of meaning of
“lead monoxide.”


indexers to a single abstract and it was not applied to any
other abstract by any of the indexers. 

At this point, it is interesting to observe the behavior 
of the indexers with respect to some of the other terms. 
Fig. 3 shows how the term “data processing” was applied. 
Note the perfect precision in the graph of Part II with 
which this descriptor was applied. However, in Part I the 
term was applied by far fewer indexers when they were 
required to supply the term rather than to assign the 
term from a list. In Part III, for some reason, the term 
was applied somewhat less frequently to the one abstract 
(i.e., less frequently than in Part II), but more frequently
to two of the other abstracts. Since the population of
indexers is so small in Part II, the results in Part III
probably express more accurately the precision with 
which this term is applied or understood. 

An objection might be raised that, although the mean- 
ing of a word may be perfectly understood by an indexer, 
he might fail to apply the word because he considers it 
useless for retrieval of the particular abstract. This objec- 
tion is specious: the distinction is too fine to influence 
the results of this experiment. That is, for the purposes 
of this experiment, the following statement was taken as 
a tautology: The word that stands for a concept is useful 
in retrieving an abstract treating that concept. The terms 
“lead monoxide” and “data processing” can be used to 
illustrate this point of view. If a given term is applied to 
a specific abstract by a large number of indexers, it is 
fair to say that those who do not apply the term do not 
Fig. 3. Graphs illustrating the precision of meaning of
“data processing.”

Fig. 4. Graphs illustrating the precision of meaning of
“electrophotographic plates.”


fully understand its meaning. Hence, we can say that the 
consistency with which a term is applied by a large num- 
ber of indexers is a good measure of how well that term 
is actually understood. 

Next, an attempt was made to discover whether or 
not a group of indexers who used one term with precision 
used other terms with equal precision. To do this, the 
term “electrophotographic plates” was selected because 
it had been used throughout the experiment with average 
precision. Fig. 4 shows how this term was applied to a 
single abstract by 88% of the indexers. We now reject the 
12% of the indexer population that failed to apply the
term to abstract 5, and an additional 34% minority who 
applied it to two other abstracts. By examining how the 
remaining 54% (174 indexers) applied other terms, we 
can gain some understanding of whether or not a popula- 
tion that is consistent in its use of a given term is also 
consistent in its use of other terms. Fig. 5 shows how the 
term “conductivity” was handled in the three phases of 
the experiment. The right-hand graph shows how the 
selected group of 174 indexers applied “conductivity” to 
the 5 abstracts used in Part III. We note that the selected 
group, who had been consistent in the application of the 
term “electrophotographic plates,” is not consistent in its 
application of the term “conductivity.” If this same pro- 
cedure is applied to other pairs of terms, we always find 
this behavior, i.e., that a given term is applied consist- 
ently by a group of indexers is no guarantee that the 
group will apply some other term with equal consistency. 

Fig. 5. Graphs illustrating the precision of meaning of 
“conductivity.” The graph on the extreme right shows the 
results in Part III, with all assignments deleted for indexers 
who disagree on “electrophotographic plates.” 

Summary 


In Part I of the experiment, 15 indexers chose descrip- 
tors for 50 abstracts with no restrictions on the choice of 
descriptors or on the language with which they were ex- 
pressed. Many descriptors were chosen, although only a 
moderate depth of indexing (3.6 descriptors per abstract) 
resulted. Only about half of the descriptors matched 
words or phrases of the abstracts. A new measure of 
retrieval efficiency was needed because no judgment of 
correctness of the indexing was made. The chance that at 
least one descriptor applied by one indexer would match 
the descriptors applied by all the other indexers, the 
chance of retrieval, was calculated. For the 50 abstracts 
the average chance of retrieval is 6.5%. The average 
chance of retrieval rose to 36% (nearly six times as high) 
when calculated using data from a selected list of 100 
terms. In Part I, even though the field of knowledge 
(corpus) and the concepts embodied in the abstracts were 
controlled, the consistency of indexing was low. The va- 
riety of expressions used for an identical or closely related 
concept frustrated the attempt to relate precision of 
meaning to indexing consistency. 
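
The chance of retrieval defined above can be illustrated with a small sketch. The exact formula is not reproduced in this summary, so the pairwise reading used here is an assumption: one indexer plays the searcher, another the original indexer, and retrieval succeeds when their descriptor sets share at least one descriptor. The function name and the sample data are hypothetical.

```python
from itertools import permutations


def chance_of_retrieval(descriptor_sets):
    """Fraction of ordered (searcher, indexer) pairs sharing a descriptor.

    Assumed reading of the measure, not the paper's stated formula.
    """
    pairs = list(permutations(range(len(descriptor_sets)), 2))
    if not pairs:
        return 0.0
    hits = sum(1 for i, j in pairs if descriptor_sets[i] & descriptor_sets[j])
    return hits / len(pairs)


# Hypothetical descriptors assigned by three indexers to one abstract.
abstract_1 = [
    {"lead monoxide", "conductivity"},
    {"conductivity", "data processing"},
    {"electrophotographic plates"},
]

# Only the first two indexers overlap, so 2 of the 6 ordered pairs match.
print(chance_of_retrieval(abstract_1))
```

Averaging this value over all abstracts would give a figure comparable in kind, though not necessarily in value, to the averages quoted in the summary.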

In Part II, 9 of the same indexers applied descriptors 
to the same 50 abstracts, using only the 100 selected 
terms. The consistency of application increased markedly, 
and 6 of the terms were used with perfect precision. The 
chance of retrieval averaged 22%. When an even smaller 
selected list of descriptors (25) was used, the chance of 
retrieval rose to a maximum of 48%. 

Most descriptors were used imprecisely. Since meaning 
is defined in terms of the relevance of a word to the 
concept it labels, descriptors will be imprecisely applied 
if any one of four factors (the field of knowledge, the 
concepts, the words, or the degree of relevance) is not 
controlled. All indexers must have a common understand- 
ing of the concepts used in a given corpus of knowledge 
and the words to be associated with these concepts. They 
must also have a common understanding of the degree of 
association (relevance) that exists between a word and 
the concept for which it stands. Analyses of the results 
show that three of these factors (field of knowledge, the 
concepts, and the words) were adequately controlled 
under the conditions of this experiment. However, be- 
cause only 9 indexers were used, the degree of relevance 
cannot be measured with a high degree of confidence. 
Hence, a larger group of indexers was used in Part III 
in order to demonstrate more fully the connection be- 
tween precision of meaning and consistency of indexing. 

In Part III of the experiment, 323 indexers applied 45 
selected descriptors to 5 abstracts. The consistency of 
application decreased and the chance of retrieval aver- 
aged 16%. Using fewer descriptors from this list raised 
the chance of retrieval. Graphs, in which the fraction of 
indexers applying a given descriptor to the abstract was 
plotted versus the abstracts and ranked so that the high- 
est ranking abstract had the greatest fraction of indexers 
applying the particular descriptor, illustrate the precision 
with which the descriptor is used. None of the descriptors 
in Part III was used with perfect precision. Furthermore, 
it was shown that precision in the use of one term did 
not imply that the same indexer would use other terms 
with precision. 
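
The ranked graphs and the subgroup test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the data layout (indexer, then abstract, then set of descriptors), the function names, and the sample assignments are all hypothetical, and the subgroup rule (applied the term to the target abstract and to no other) is a simplification of the rejection procedure used in Part III.

```python
def application_fractions(assignments, term, abstracts):
    """Return (abstract, fraction of indexers applying `term`), highest first."""
    total = len(assignments)
    if total == 0:
        return []
    fractions = []
    for abstract in abstracts:
        users = sum(
            1 for per_abstract in assignments.values()
            if term in per_abstract.get(abstract, set())
        )
        fractions.append((abstract, users / total))
    return sorted(fractions, key=lambda pair: pair[1], reverse=True)


def consistent_subgroup(assignments, term, target, abstracts):
    """Keep indexers who applied `term` to `target` and to no other abstract."""
    return {
        indexer: per_abstract
        for indexer, per_abstract in assignments.items()
        if all(
            (term in per_abstract.get(abstract, set())) == (abstract == target)
            for abstract in abstracts
        )
    }


# Hypothetical data: four indexers, three abstracts, two descriptors.
abstracts = [1, 2, 3]
assignments = {
    "i1": {1: {"plates", "conductivity"}, 2: set(), 3: set()},
    "i2": {1: {"plates"}, 2: {"conductivity"}, 3: set()},
    "i3": {1: {"plates"}, 2: {"plates"}, 3: {"conductivity"}},
    "i4": {1: set(), 2: set(), 3: {"conductivity"}},
}

print(application_fractions(assignments, "plates", abstracts))
subgroup = consistent_subgroup(assignments, "plates", 1, abstracts)
print(application_fractions(subgroup, "conductivity", abstracts))
```

In this toy data the subgroup that agrees on "plates" still splits evenly on "conductivity", which mirrors the kind of result reported for the selected group of 174 indexers.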


In the experiment, the field of knowledge was [illegible] 
and well defined, the terms were kept constant, and the 
concepts embodied in the abstracts were kept constant. 
Nonetheless, perfect consistency was not obtained. Obvi- 
ously, a fourth variable is present. I conclude that mean- 
ing can be defined as the relevance of a word to the 
concept it labels, and that the degree of relevance, as it is 
understood by the various indexers, is the most important 
variable accounting for these results. 

It is clear that the meaning of words is not understood 
precisely, even when the words labeling scientific concepts 
are used by competent scientists in their own field of 
knowledge. Any aid (thesaurus) or procedure (multiple 
indexing, personal contact) that helps to increase the 
precision of meaning among a particular group can be 
expected to increase the effectiveness with which they 
exchange information. To make an excellent index, means 
must be found that will allow good communication by 
means of words which are inherently imprecise. 


Brief Communication 


A Matrix for Evaluating Information 
System Operation¹ 

There are so many sources of information available and 
so many types of inquiries that it is difficult to talk explic- 
itly about information exchange in a research field. The 
simple technique presented here is the approach we used to 
examine information exchange relating to civil defense re- 
search. It may prove useful to others in examining informa- 
tion systems. 


Information Users 


The matrix shown as Fig. 1 illustrates present informa- 
tion transfer patterns. For illustration, five users of civil 
defense information are listed down the left side of the 
figure. 


1. Office of Civil Defense (OCD) 

2. State and Local Civil Defense 

3. Research Contractors 

4. Delegate Government Agencies (Those agencies with 
specific civil defense responsibilities as delegated by 
Executive Order of the President.) 

5. Public and Other Users 


Forms of Information 


Nine forms of information are listed beside each of the 
five users. These forms show that information is obtained in 
many ways. 

A user can know what information exists by reading 
Review Articles, by regularly scanning Accession Lists, or 
by requesting Bibliographies on a particular subject. 

The user may also be led to information by Discussions 
with individuals knowledgeable in the field, particularly 
those individuals with research Work in Progress. News 
Releases may also notify users of available information. 

The information itself may be contained adequately in 
Abstracts or it may be necessary to read the Reports and 
Books themselves. The information also may be obtained 
by attendance at Formal Meetings where papers are pre- 
sented. 

The nine terms in italics are referred to in Fig. 1 as the 
forms of information available to each of the five users. 
These forms correspond to the rows in the information ex- 
change matrix. 


The Sources of Information 


Nineteen sources or types of sources of information are 
listed across the top of Fig. 1. Two of these sources are 
hypothetical; i.e., they don't exist at present, but may be 
useful. These sources form the columns of the information 
exchange matrix. 


1. Research Directorate—OCD 

2. Other Personnel—OCD 

3. Publications—OCD 

4. Information Center—OCD (Hypothetical Services) 

5. Depository Libraries—OCD (Hypothetical Services) 

6. Contractors—OCD 

7. Army Library—Pentagon 

8. Defense Documentation Center 

9. Delegate Agencies (Including OEP) 

10. Government Printing Office 

11. Clearinghouse for Federal Scientific and Technical 
Information 

12. Library of Congress 

13. Atomic Energy Commission 

14. NIH, NASA, NSF, etc. 

15. Science Information Exchange 

16. National Academy of Science (NRC) 

17. Centers for Analysis of Scientific and Technical In- 
formation 

18. Professional Societies and Journals 

19. News Media 


¹ This article is based on research sponsored by the Office of Civil 
Defense, Department of the Army, under contract OCD-PS-64-58. At 
the time this research was conducted Mr. Jenkins and Mr. Herzog were 
operations research analysts with the Research Triangle Institute, Dur- 
ham, North Carolina. 


Explanation of the Matrix in Fig. 1 


The present information transfer patterns are evaluated 
in the matrix presentation of Fig. 1. 

Our evaluation of information exchange is placed in the 
rectangle formed by the intersection of a row and a column. 
For instance, row one describes the places a user in the 
Office of Civil Defense presently searches for a report copy: 
chiefly, the OCD Research Directorate, the Army Library, 
or the Defense Documentation Center. Occasionally this 
user requests the document from a Contractor, a Delegate 
Agency, the Government Printing Office, the Federal Clear- 
inghouse, the Atomic Energy Commission, or the National 
Institutes of Health. 

In other words, when a solid black rectangle falls at the 
intersection of a user row and a source column, it means that 
the user normally looks to that source for information. In 
the illustration, a user in the Office of Civil Defense when 
looking for reports normally looks to the Army Library 
(among others) for copies. Therefore, a black rectangle fills 
the space at the intersection of row one (Office of Civil 
Defense—Reports & Books) and column seven (Army Li- 
brary—Pentagon). 

When a black dot is placed at an intersection, it means 
that some limited exchange occurs. The limitation may 
occur because the source service is limited or because the 
users make inadequate use of the available service. 

When no black mark is placed at an intersection, little 
or no information exchange now occurs between the particu- 
lar user and source combination. 


Explanation of the Matrix in Fig. 2 


Fig. 2 illustrates the exchange which might occur under 
ideal conditions. It recognizes that some services are not 
being used because they are not and will not be appropriate. 
Others are not being used, but should be; e.g., Science Infor- 
mation Exchange. 

If Fig. 1 is printed in color on a transparent overlay and 
placed on top of Fig. 2, the opportunities for improving the 
information exchange patterns become immediately obvious. 
The black rectangles and dots of the potential exchange that 
show through the overlay are points where the potential is 
not being met. The overlay provides a simple method of 
visualizing the overall information exchange pattern and 
for identifying existing gaps in the information transfer 
process. 


Fig. 1. Present transfer patterns, civil defense information system. 

Clear rectangle = Little or no information is presently provided to the user by the corresponding 
source. 
Black dot = There is limited to fair exchange of information at present. 
Black rectangle = There is satisfactory or good exchange of information at present. 


Fig. 2. Potential transfer patterns, civil defense information system. 

Clear rectangle = There is little or no potential exchange of information. 
Black dot = There is limited to fair potential for information exchange in the future. 
Black rectangle = There is good potential for information exchange in the future. 
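
The overlay comparison of Fig. 1 against Fig. 2 can be sketched in code. This is only an illustration under stated assumptions: the three-level numeric coding, the dictionary layout keyed by (user, form) rows and source columns, the function name, and every cell value except the Army Library rectangle mentioned in the text are hypothetical.

```python
# Assumed numeric coding of the three symbols used in both figures.
CLEAR, DOT, RECTANGLE = 0, 1, 2


def exchange_gaps(present, potential):
    """List cells where the potential exchange exceeds the present one."""
    gaps = []
    for row, potential_row in potential.items():
        for source, level in potential_row.items():
            current = present.get(row, {}).get(source, CLEAR)
            if level > current:
                gaps.append((row, source, current, level))
    return gaps


# One row of each matrix: OCD users looking for reports and books.
row = ("Office of Civil Defense", "Reports & Books")
present = {row: {"Army Library": RECTANGLE, "Science Information Exchange": CLEAR}}
potential = {row: {"Army Library": RECTANGLE, "Science Information Exchange": DOT}}

for user_form, source, now, possible in exchange_gaps(present, potential):
    print(user_form, source, now, possible)
```

A cell is reported only when the potential mark is stronger than the present one, which corresponds to a mark of Fig. 2 showing through the overlay.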


Conclusions 


Many scientific and technical information services are 
available from various government and private agencies. 
Consequently, numerous formal and informal channels of 
information exchange exist which should be considered be- 
fore adding new information services in a research field. 
These existing channels are not easy to visualize. We 
found the matrix approach useful in studying the overall 
exchange of information, in identifying new systems which 
might be useful, and in identifying existing services that can 
be used more efficiently. 


J. EDWARD JENKINS² and 
WILLIAM T. HERZOG³ 

Research Triangle Institute 

Durham, North Carolina 


² Presently with Department of Electrical Engineering, University of 
Edinburgh, Edinburgh, Scotland. 

³ Presently Assistant Professor, School of Public Health, University of 
North Carolina, Chapel Hill, North Carolina. 


nting of Venn diagrams, Mr. Sharp (1) asserts, "It is, 
unfortunately, impossible to use simple diagrams for more 
than 4 terms.” 

He may well be correct about their use or their utility. 
Nevertheless, their construction is possible for any number 
of variables, using simply connected areas that are at least 
conceptually simple also. 

Many methods have been described and reinvented since 
ы. 1, most of them using comb-like polygons. This indeed 
was suggested by Venn himself. 

The 5-term diagram results from placing a U-shaped 
polygon on the well-known 4-term set of rectangles. The 
method can be extended indefinitely. See, for instance, 
Anderson and Cleaver (2), and also Martin Gardner’s book 
(2). The latter is fully illustrated and has a good bibliog- 
raphy. [illegible] to remind those who need to be 
reminded that Euler diagrams are not Venn diagrams. 

Herner and Company 
Letters to the Editor 


Dear Sir: 
In the July 1965 issue of American Documentation (Vol- 
ume 16, No. 18), on page 237, in the paragraph “Inter- 
[illegible]” there are some strange statements about ICSU 
A.B. and FID to which I would like to draw your atten- 
tion, since you are among the officers of the “American 
Documentation Institute.” 
(1) We are not considering extending our activities to 
include Nuclear Energy and Geology. 
(2) As you may see, there is some bad muddle about the 
Study on Abstracting Periodicals in Physics. 
What is [illegible] in the second paragraph “The Inter- 
national Federation . . . . Conference of Biological Editors” 
are not FID recommendations, but recommendations of 
the UNESCO final Technical Working Group on Scientific 
Documentation which met in Paris in March 1964. The 
report of this Working Group is document UNESCO/NS/ 
Doc. [illegible]: 


1. Co-ordination of the recommendations of the three 
Working Parties on Scientific Documentation — Ways and 
Means of implementing the above recommendations. 


The recommendations of the three Working Parties in 
scientific publications (Philadelphia, September 1963), 
automatic documentation (Moscow, November 1963) and 
scientific translation and terminology (Rome, January 
1964) were examined and generally supported and ap- 
proved. In addition, the following comments and sugges- 
tions were agreed: 
Ad hoc Sub-committee to study and report on methods of 
primary scientific publication. It was stressed that it would 
be advisable to have on this Sub-committee representa- 
tives of editors, scientific documentalists and librarians, 
specialists in computation and users (scientists and en- 
gineers). Names of possible members of this Sub-Commit- 
tee were proposed but it was finally decided that the 
members of the Working Group should send to the Sec- 
retary a short list of names of individuals and organiza- 
tions that might be represented on the Sub-committee. It 
was stressed also that the terms of reference should in- 
clude not only primary publications but also their relation 
to secondary publications. 
It was considered that the study proposed by the ICSU 
Abstracting Board on abstracting periodicals in physics — 
“Bulletin Signalétique”, “Physikalische Berichte”, “Re- 
ferativny Zhurnal” — would provide a good source of 
information for the Sub-committee. It was suggested that 
Unesco should therefore assist the project, which will com- 
plement a similar survey that is being carried out in the 
[illegible] on Physics Abstracts. 


I think it would be necessary that American Documenta- 
tion publish an erratum about these points. We would like 
the statement on the ICSU A.B. to be as follows: 


The ICSU A.B. continued its general activities to improve 
the dissemination and quality of scientific information. 
The service of exchange of proof copies or advance copies 
for Member Services of the Board (the main abstracting 
periodicals all over the world covering Physics, Chemistry 
and Biology) has been reorganized and enlarged. The 
ICSU A.B. is also doing some precise studies, among which 
may be quoted: 

- a detailed study of the main primary periodicals in 
Physics, Chemistry and Biology 

- a complete statistical study of the 1964 issues of 
Physics Abstracts, Physikalische Berichte and Bulletin 
Signalétique (Physics sections). 


Results of these studies will be published in the course 
of 1966. 


J. Poven 
Conseil International des 
Unions Scientifiques 
Paris 
The reviews in this issue were written as one of the regular 
assignments in the Modern Information Systems course at 
the Columbia School of Library Service. They are of books 
important in the field which have not previously been re- 
viewed in American Documentation, and are included here 
primarily as reviews — but also as some kind of reflection of 
education in the field. Student reviews are solicited from 
other institutions as well. Interested prospective reviewers, 
practitioners, and students are again urged to write to the 
Reviews Editor, indicating their special areas of interest. 


Textbook on Mechanized Information Re- 
trieval. 1962. Allen Kent. Interscience, New York. 268 pp. 


In his preface Kent delineates five purposes to be served 
by his textbook: (1) to serve as a textbook for fifth year 
graduate study in library school; (2) to serve the admin- 
istrator, scientist, and practicing librarian in acquiring some 
basic understanding of a field that has progressed so fast 
that it may have eluded their ken; (3) to serve the de- 
veloper of a retrieval system, whether for individual use or 
for large-scale exploitation, who cannot obtain unbiased ad- 
vice as to choice of procedures and equipment; (4) to serve 
as a first guide for those who wish to compare their retrieval 
system to others that may be used; (5) to serve com- 
mercial interests who, in the long run, will benefit from an 
educated clientele. 

This is a very large order and one that has been only 
partly fulfilled. Basically, this is a textbook for beginners. 
It can help students, administrators, scientists, librarians, 
and those interested in developing a retrieval system, pro- 
vided they are beginners in the field of information retrieval. 
In the fourth purpose as outlined, Kent himself qualifies the 
use of his book as a first guide for those who wish to com- 
pare their systems with others. To say that the book will 
benefit commercial interests through educating readers to 
the uses of equipment is stretching a point, but may be true 
if, indeed, more librarians are inspired to apply some of the 
knowledge gained. 

The book is divided into two sections. Section I contains 
eight chapters of text and illustrations. Section II is made up of 
supplementary reading lists, classroom exercises, field trips, 
suggestions for the use of audiovisual material, and a sample 
[illegible]. 

After a chapter of introduction (corresponding to 
Bourne’s “Nature of the Problem”) describing the in- 
formation problem and what the book is about, Kent de- 
votes a chapter to “Physical Tools.” This chapter provides 
a step-by-step discussion of the unit operations in machine 
literature searching, and it is from this unit operation ap- 
proach that Kent writes his book. A brief list of the head- 
ings in this chapter will show how simply and clearly the 
material has been handled. Unit operations include: analy- 
sis; control of terminology; recording results of analysis in 
a searchable medium; storage of source documents, extracts, 
abstracts, bibliographic references; question analysis and 
development of search strategy; conducting the search; de- 
livery of results. Each of these headings is further sub- 
divided and discussed. 

With this material there are excellent illustrations of the 
various tools. Using this chapter as a basis, Kent then de- 
votes Chapter 3 to more detailed discussion of the “Prin- 
ciples of Analysis” (including indexing) and Chapters 4 
and 5 to the “Principles of Searching” and the “Manipula- 
tion of Searching Devices.” Kent’s exposition in Chapter 3 
of indexing by means of the two techniques, “word index- 
ing” and “controlled indexing,” seems more meaningful 
than Bourne’s chapter on “Classification and Indexing.” 
This may only prove that Kent’s book is for the novice. 
Bourne’s chapter goes into far more detail and gives more 
examples of indexing systems. An example of Kent’s ele- 
mentary approach can be found in his step-by-step detailed 
discussion of a method of producing a concordance. This 
type of explanation is not found in Bourne, who assumes 
the reader has the basic knowledge and, if he does not, 
Bourne gives him bibliographic citations, e.g., Wisby, R. 
“Concordance Making by Electronic Computer: Some Ex- 
periences with the ‘Wiener Genesis,’” Modern Language Re- 
view, 57, (2): 161-172 (April 1962). 

In Chapters 4 and 5 of Kent, the strategies of searching, 
and how the various non-conventional systems are used to 
satisfy these strategies, are especially illuminating. These 
logical concepts are not discussed by Bourne. As noted 
earlier, it is the step-by-step explanations in Kent that are 
helpful to one who is ignorant of the field. For example, to 
obtain the logical product of logical sums 
with marginal punched cards, Kent writes: ‘[Cards are] 
sorted for aspect A; those not selected are sorted for aspect 
B. Those selected after each sort are sorted for the presence 
of aspect C; those selected are the first fraction of the 
response. Those not selected are sorted for aspect D; those 
selected are the second fraction of the response.’ 
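
The sorting sequence quoted from Kent can be traced in a short sketch. It reads the procedure as selecting cards that satisfy (A or B) and (C or D); that reading, the function names, and the sample deck are assumptions made for illustration, and each card is modeled simply as the set of aspects punched on it.

```python
def needle_sort(cards, aspect):
    """Split a deck into cards carrying `aspect` and cards lacking it."""
    selected = [card for card in cards if aspect in card]
    rejected = [card for card in cards if aspect not in card]
    return selected, rejected


def product_of_sums(cards, a, b, c, d):
    """Follow the quoted sort order and return the two response fractions."""
    with_a, without_a = needle_sort(cards, a)
    with_b, _ = needle_sort(without_a, b)
    a_or_b = with_a + with_b                   # selected after each sort
    first_fraction, without_c = needle_sort(a_or_b, c)
    second_fraction, _ = needle_sort(without_c, d)
    return first_fraction, second_fraction


# Hypothetical deck of five cards.
deck = [{"A", "C"}, {"B", "D"}, {"A"}, {"C", "D"}, {"B", "C", "D"}]
first, second = product_of_sums(deck, "A", "B", "C", "D")
print(first)
print(second)
```

The two fractions together hold exactly the cards that carry A or B and also carry C or D, and no card appears in both fractions.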

Chapter 6, “Words, Language, Meaning and Retrieval 
Systems,” is weaker and somewhat repetitious of the infor- 
mation on indexing already presented. Chapter 7, “Codes 
and Notations,” cannot be compared with Bourne’s similar 
chapter in erudition, thoroughness, and scope. It is good for 
basic simple principles. It differentiates in simple language 
between superimposed and nonsuperimposed [illegible] 
notations, “numeric versus alphabetic notations,” “fixed 
field versus free field,” etc. 

“System Design Criteria,” Chapter 8, offers factors, such 
as one’s objectives, functions, performance requirements, 
and environmental variables, to consider when designing a 
system. It is a very general guide offering no comparison 
of what various systems can do nor how much they might 
cost. 


Part II, devoted to “Exercises,” seems quite elementary 
when compared with the papers required for some intro- 
ductory courses. One which might be cited is to have the 
class sit in a circle and whisper one to another to note how 
information is changed during transmission. A suggestion in 
Part II that might be worthwhile, however, was to show a 
film on machine literature searching. This might be a sup- 
plement to a trip to IBM, as it would be geared to library- 
type operations. Films suggested were those produced by 
General Electric Company; Smith, Kline and French; 
or the Armed Services Technical Information Agency. 

The bibliography in Kent’s book is most parochial — 
85% of the references are to articles or books written by 
Kent or by his associates. It could not be used for deeper 
study of the subject as could Bourne’s excellent bibliog- 
raphy. 


Both an author and subject index are provided. This 
Textbook on Mechanized Information Retrieval could 
profitably be read before Bourne as an introduction to the 
field of information handling. It offers simple definitions and 
[illegible] procedures. Illustrations, charts, and samples 
are good. However, the unit operation approach gives no 
picture of systems as a whole, such as is found in Bourne, 
nor does it adequately consider system applications or 
evaluations. Kent is especially weak in information regard- 
ing computers. For a more scholarly and complete reference 
work with excellent bibliography one must use Bourne. For 
anyone developing a retrieval system to rely solely on Kent 
would be foolhardy. A library school course completely de- 
pendent on Kent without the addition of Bourne is not 
exploring information systems in depth and is not using the 
basic tool in the field. 

AUDREY RUBIN 


4/66-2R Towards Information Retrieval. 1961. Robert 
A. Fairthorne. Butterworths, London. 211 pp. 


Towards Information Retrieval is a collection of papers 
written over a thirteen-year period. While no attempt has 
been made to add textual material which would unify them, 
the papers do form a composite picture of some of the 
theoretical aspects of information retrieval, bringing to the 
reader various facets of a common theme. 

Robert A. Fairthorne is a noted figure in documenta- 
tion, although perhaps better known in England than in 
the United States. The earliest paper presented in this col- 
lection dates from 1947, indicating a long concern with the 
field. Some thirty-five years of the author's career were 
spent at the Royal Aircraft Establishment, where he was 
consultant at the time of this publication. During the first 
twenty years of his career he applied his mathematical 
training to a wide variety of technical problems. During the 
next fifteen years he was also much involved with the or- 
ganization of the library at the Royal Aircraft Establish- 
ment and the practical and theoretical problems involved 
became his major concern. He is now with Herner and Com- 
pany in the United States. 

In his article “Identifying Key Contributions in Informa- 
tion Science” (Am Doc 15: 289-295, Oct. 1964), Carlos A. 
Cuadra singles out this volume as a major text in the field. 
Fairthorne’s name also appears in Cuadra’s table of fre- 
quently cited authors, derived from a count of entries in 
bibliographies in the field in the same study. The Cuadra 
study shows the article “Basic Postulates and Common 
Syntax” which appears in this volume as being cited by five 
major textbooks in the field. 

However, the book cannot be considered as a text in the 
field in the sense of providing a survey of the entire field 
and fundamental information on its key aspects. It does, 
nevertheless, represent the best (according to the reviews 
surveyed) and probably the most frequently cited of 
Fairthorne’s works. The theme which ties the papers together 
is probably best described in Fairthorne’s own words taken 
from the preface: 


For some millenia, librarians have had to deal with texts 
as carriers of concepts, and with texts as heavy objects 
with marks on. They have evolved efficient techniques 
and principles to cope with these aspects severally. Rarely 
have they discussed texts in both capacities at once. 


The selection of papers published here explores activities 
in which indefinite neglect of either aspect, the conceptual 
or the mechanical, will lead to practical and theoretical 
disaster. They centre on the recovery of records according 
to their subject matter. 


The articles explore various areas of documentation, 
analyze and criticize existing systems, and seek new insights 
for blending the conceptual with the manipulative. 
Throughout the papers, Fairthorne’s intent appears to be 
that of raising problems and drawing attention to them 
rather than offering solutions. In an introductory section 
entitled “Comments,” Lea M. Bohnert states that the best 
introduction to the field and to the author’s general ap- 
proach is the paper “The Pattern of Retrieval” originally 
published in American Documentation in 1956 and reprinted 
here. Fairthorne indicates the nature of his concerns when 
he writes, “A deep question of great theoretical and prac- 
tical importance is how far can we go in documentation, as 
in computing, by using ritual in place of understanding?” 
On notation: “The bridge between the concepts and physics 
of retrieval is notation, or systems of marking the texts.” 
“They [librarians] have given little attention and have 
had little need to give attention to the mechanical conse- 
quence of notation considered as instruction for retrieving 
rather than recognizing documents.” Fairthorne spends some 
time in discussing the classification of tasks in information 
work, especially those which, in his words, may be “dele- 
gated” to the machine. “Fortunately,” says Bohnert, “Fair- 
thorne belongs to the economic breed that considers it effi- 
cient to have human machines perform the unusual and 
variegated types of work.” Fairthorne does not appear to 
expect classification to solve the problems of retrieval as 
seems currently fashionable. In fact, he does not seem to 
expect much from classification at all, in spite of his sev- 
eral writings on the subject. 

Another of his major concerns is that of cost. He states 
that theory can be used to produce a fair estimate of costs 
when we study “all the links in the operational chain.” 
But: “The theory can give only the least cost of clerical 
operations. Evidently the greatest cost depends only on 
what the author of the system can get other people to put 
up with. In practice, the limit seems to have been reached 
by the time the entries needed for retrieval exceed those in 
the documents to be retrieved.” He believes that models 
of document retrieval systems should be used for experi- 
mental study before more money is spent on expensive 
varieties of retrieval machinery. 

On the whole the volume is not easy reading. Much of it 
is theoretical and requires slow deliberate concentration for 
comprehension, and even then some is elusive. Fairthorne 
works through what he has to say with precision. His mathe- 
matical interests and ability are evident in the many dia- 
grams and formulas. To the mathematically untrained the 
volume appears rather frightening by its not infrequent 
complicated passages. 

Yet Fairthorne cannot be criticized for being deliberately 
obscure, or unnecessarily complex. The writing is straight- 
forward and lucid. He apparently attempts to write with 
great clarity — so much so that he often achieves a dis- 
arming simplicity in his statements. His tendency to reduce 
complex notions to ordinary terminology is often evident — 
“marking” and “parking” as the two [illegible] methods for 
organizing information for the retrieval process. He is often 
amusing or witty. When he is critical, his criticism is often 
biting, as in the opening of his article on "Delegation of 
Classification.” 

This volume is a most valuable contribution to the litera- 
ture. That librarians have failed to appreciate Fairthorne 
can, according to Vickery, be attributed to a number of 
factors: “By and large, librarians are concerned to empha- 
size the intellectual content of their work, and display a 
marked psychological resistance to a description of part of 
it as ‘clerical’ and capable of performance by automation. 
It almost seems that they spurn labour-saving devices, 
despite their constant complaint of overwork. They have 
a fear of automata to overcome. They should ponder 
Fairthorne’s words ‘automatism is merely remote control in 
time.’ ” Vickery adds that Fairthorne has never actually 
participated in building a retrieval system. “In short, he is 
a theorist, and suffers the usual fate of lack of understand- 
ing by ‘practical men.’ ” 

The index is by Calvin Mooers. 

While much of the material is not recent, most of Fair- 
thorne’s questions are as valid today as they were when he 
first raised them. This is true mainly because the author’s 
concern is with basic theory, and not with descriptions of 
current practice. 

MARGARET LINN 


4/66-3R Indexing Theory, Indexing Methods and 
Search Devices. 1964. Frederick Jonker. Scarecrow, New 
York. 124 pp. 


Frederick Jonker’s chief purpose in writing this book was 
to give a full exposition of a “generalized theory of indexing” 
which he had begun to develop a few years earlier. The 
expression “generalized theory” may be understood as 
referring to the process of describing a group or series of 
events in words sufficiently general to encompass all aspects 
of those events, and sufficiently specific that the description 
is recognizable as being uniquely of those events. Some 
groups of events lend themselves quite readily to such treatment 
by exhibiting many characteristics in common. An 
author then says that he has formulated a theory, because 
he has noticed the common characteristics. Needless to say, 
once a theory has been formulated, it is very easy to see 
subsequent events as operating within its framework. Indeed, 
sometimes it is almost impossible not to see them 
that way. Moreover, when events are described in the 
terminology of the theory, they seem subtly to change in 
character to fit it. 

Jonker points out that the only valid criterion for designing 
an information retrieval system is cost: “how to 
deliver a specified quantity, quality and speed of service at 
the lowest possible cost.” Since the initial indexing, and not 
the entering of data or the search, is by far the greatest 
cost factor, an analysis and understanding of this part of 
the I. R. structure is the prime necessity. However, since he 
is attempting to provide the common precepts by which 
individual systems can be judged for their suitability to a 
particular I. R. problem, the author has formulated his 
theory to cover all aspects of the systems. 

Jonker begins the development of his theory by defining 
mechanized I. R. as search “by coincidence of terms.” He 
proves this by showing that all the logical relationships 
among indexing terms which a system may be required 
to provide may be reduced to readout functions, 
coincidence-of-terms search, or to a combination of the 
two. He does not, however, limit his theory to a description 
of these activities. He points out that most systems in 
actual use are combinations of hierarchical or classified 
grouping with term coordination, and therefore attempts 
to encompass both. 

The two basic factors in any index are the kind of terminology 
used, and the ways in which the words are made 
to relate to each other and to indicate relationships among 
the concepts embodied in the information store. 

For the first of these, Jonker postulates a “terminological 
continuum” which he conceives as a direct function of the 
development of knowledge. He represents it schematically 
as a straight line proceeding from left to right. It is his 
contention that the language of a field of knowledge develops 
from longer to shorter terms. When a new concept 
is born and recognized, words are taken from several older 
concepts to describe it. He considers this the left end of the 
continuum. As the new concept becomes accepted and 
widely used, and in turn forms the basis for further developments 
in the field, new and unique words are used to describe 
it. Sometimes two or more older-concept words are 
simply combined to form one; sometimes they are hyphenated 
into a single inseparable expression; in other cases, a 
new word is coined. Since this is a natural evolutionary 
process, it cannot be depended upon to happen consistently, 
or at a particular rate of speed. It does not eliminate ambiguities 
caused by synonyms, homonyms, and shades of 
accepted meaning when the same word is used in different 
but related fields. It does not obviate the problem that different 
people in referring to the same concept will use words 
from different stages of development of the vocabulary. For 
greatest precision, therefore, an indexing system should, in 
principle, assign a unique word or code indication to every 
unique concept. This is the extreme right end of the 
continuum. 

Such accuracy can be achieved only at great cost. In 
practice, the theory has two lessons for the system designer. 
He must be aware of the level of the vocabulary development 
(within the field with which the system deals) of the 
users of the system. He must also understand the language 
of the body of literature to which he is providing access. 
His job is to create a bridge between the language of the 
user and that of the system and, then, through really effective 
indexing, between the system and the literature. 
The first span may consist simply of a list of the index 
terms used; it may be in the form of a thesaurus; or it 
may be a translation mechanism built right into the 
machine. 
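
The first span of that bridge, followed by the coincidence-of-terms search mentioned earlier in the review, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the thesaurus entries, the document numbers, and the names `THESAURUS`, `STORE`, `to_index_term`, and `coincidence_search` are hypothetical and are not taken from Jonker's book.

```python
from typing import Dict, Iterable, Set

# Hypothetical thesaurus: the user's word -> the index term the system uses.
THESAURUS: Dict[str, str] = {
    "price": "cost",
    "expense": "cost",
    "subject cataloging": "indexing",
}

# Hypothetical term-grouped store: index term -> documents posted under it.
STORE: Dict[str, Set[int]] = {
    "indexing": {1, 2, 4},
    "cost": {2, 3, 4},
    "thesaurus": {4, 5},
}


def to_index_term(word: str) -> str:
    """First span of the bridge: translate a user's word into an index term."""
    key = word.strip().lower()
    return THESAURUS.get(key, key)


def coincidence_search(words: Iterable[str]) -> Set[int]:
    """Return the documents on which every translated term coincides."""
    postings = [STORE.get(to_index_term(word), set()) for word in words]
    if not postings:
        return set()
    return set.intersection(*postings)


print(sorted(coincidence_search(["Expense", "subject cataloging"])))  # [2, 4]
```

A plain dictionary stands in for the thesaurus here; a "translation mechanism built right into the machine" would play the same role inside the search device itself.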
The author puts forward cogent arguments against the 
possibility of a universal indexing vocabulary, applicable 
to all fields and all users. He claims that there is no 
standard criterion upon which to base such a language. In 
some cases, the use of something may be the best way to 
describe it. In others, for other people, the structure of that 
same thing may be more important. 

Jonker postulates a “connective continuum” to describe 
the historical development of ways of showing relationships 
among concepts indexed. At one end is the classification 
system where a term is placed with others to which it bears 
a hierarchical relationship. This produces very long index 
terms, since each one carries with it all its relatives. At the 
opposite end is the keyword technique. Here, the only thing 
that can be discovered is what other items of information 
are stored in the same document or what other documents 
bear the same information. This produces the shortest index 
terms. Between the two ends is the subject heading list that 
overlaps both extremes by frequently giving some hierarchical 
indication, while also giving several subject headings 
for a particular item of information. The greatest potential 
for “indexing depth” (defined by the author as the 
number of criteria by which an item of information may 
be indexed) is at the short-term end of the continuum. 
However, since an item of information is entered at this 
end only on the hierarchical level on which it appears in 
the literature being indexed, it will be lost in a search by 
a word applying to a higher or lower level. Searches must 
be made on various levels if generic information is sought. 
On the other hand, the short-term end can handle ideas at 
all levels of their development, since as many terms as 
deemed necessary can be used to describe them, with no 
need to fit them into a preconceived pattern. The system 
designed at the short-term end of the continuum is, therefore, 
inexpensive (relatively) to feed, but may incur great 
expense in search time or coordination mechanisms at the 
output. A classified system is more expensive to feed, and 
may lose new ideas by erroneously placing them in 
hierarchies in which they are later found not to belong, but 
should be the simplest and cheapest at the output. 

Combining the two continua, any I. R. system might be 
viewed as a point on a two-dimensional plane. The horizontal 
might be considered the indexing type, moving from 
classification to keyword, and the vertical the terminology 
type, moving up from lay to professional language (long to 
shorter terms). The decision must always be made at what 
point along each of these lines a particular system will 
operate. Lines drawn from these points, and perpendicular 
to the axes, will intersect at a point which may be said to 
define the system. 
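
The two-dimensional picture above can be expressed as a small data structure. The sketch below rests on assumptions not found in the review: each continuum is scaled from 0 to 1, the midpoint 0.5 is used as an arbitrary dividing line, and the names `SystemPoint`, `connective`, `terminology`, and `describe` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemPoint:
    """Position of a retrieval system on the two continua (both scaled 0..1)."""

    connective: float   # 0.0 = pure classification, 1.0 = pure keyword
    terminology: float  # 0.0 = lay language (long terms), 1.0 = professional (short terms)

    def __post_init__(self):
        # Reject coordinates that fall off either axis.
        for name in ("connective", "terminology"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie between 0 and 1, got {value}")

    def describe(self) -> str:
        # The 0.5 threshold is an assumption made for illustration only.
        indexing = "keyword" if self.connective >= 0.5 else "classified"
        language = "professional" if self.terminology >= 0.5 else "lay"
        return f"{indexing} indexing, {language} vocabulary"


print(SystemPoint(connective=0.9, terminology=0.8).describe())
```

Treating the axes as continuous reflects the point that a subject heading list sits between the two extremes rather than at either end.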


Developing his theory further, Jonker goes on to analyze 
the mechanization of I. R. systems. At the present time, he 
feels, the most important time- and labor-saving functions 
of mechanization lie in facilitating term correlation. Other 
operations, such as automatic encoding, printout, etc., are 
simply added benefits in more complex systems. 

For students of information systems, perhaps the most 
valuable section of this book is the chapter entitled “Primary 
design consideration” (pp. 90-114). Here, the author 
gives a clear discussion of existing commercially available 
systems from the point of view of their suitability to particular 
I. R. needs. Working from his theory, he considers 
the efficiency with which they accomplish term correlation 
with respect to the way they handle three basic operations: 


store organization (document or term grouping) 
matching (simultaneous or sequential) 
access to the store (single or multiple) 


Using this gauge, eight basic types of systems are possible, 
in the form of different combinations of these operations. 
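
The count of eight follows from three operations with two alternatives each (2 × 2 × 2). A minimal sketch that enumerates the combinations is given below; the operation and alternative labels come from the list above, while the names `OPERATIONS` and `system_types` are hypothetical, and the order of the output is not Jonker's own numbering.

```python
from itertools import product

# The three basic operations and their two alternatives, as listed in the review.
OPERATIONS = {
    "store organization": ("document grouping", "term grouping"),
    "matching": ("simultaneous", "sequential"),
    "access to the store": ("single", "multiple"),
}


def system_types(operations=OPERATIONS):
    """Return every combination of one alternative per operation."""
    names = list(operations)
    return [dict(zip(names, choice)) for choice in product(*operations.values())]


for number, combination in enumerate(system_types(), start=1):
    print(number, combination)
```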


The author gives examples of systems embodying each of 
the combinations. He gives excellent diagrams which 
demonstrate the principles by which they work, and which 
are far more valuable than the photographs in, for example, 
Bourne’s Methods of Information Handling. There is also 
a list of sources of supply, but with none of the valuable 
cost information given by Bourne. 


Jonker seems somewhat overly captivated with the 
theoretical approach. He makes the following statement 
about the machine “art”: “In developing his design, the 
designer proceeds from the most fundamental considerations 
available to him to considerations which are usually 
of a less abstract nature, and from there to design details” 
(p. 85). This is a theoretically sound approach, but, in 
practice, if the process takes place, it must often be somewhere 
below the conscious level. Nevertheless, this book has 
much valuable material for the student, and the “general 
theory” may be at least a way to analyze the systems available 
when a potential consumer must decide which one 
suits him best. 

EDITH WARD 


CALL FOR PAPERS 


The 29th Annual Meeting of the American Documentation Institute will be held in Santa Monica, 
California, on October 3 through 7, at the Miramar Hotel. 


Theme of the meeting is “Progress in Information Science and Technology.” We will welcome 
papers reporting on original research, significant trends, and new concepts, techniques, and 
applications of information science and technology. You are cordially invited to help create an 
interesting, varied, and informative technical program. 


Accepted papers will appear in the Proceedings, to be available one month before the conference, 
and will be the focus of individual Author Forums. 


An award will be given by ADI for the three best papers submitted for the meeting. To insure 
maximum visibility for exceptional work, these papers will also be read by the authors at a 
special plenary session. 


Contributed papers may be up to 2,500 words in length and may include illustrations not to exceed 
two printed pages. Papers should be accompanied by a 100- to 125-word abstract. Five copies of 
paper and abstract should be submitted by May 15, 1966. Contributors will be notified regarding 
acceptance of papers by August 1, 1966. 


Send To: Dr. Carlos A. Cuadra 
Technical Program Chairman, 1966 ADI Meeting 
System Development Corporation 
2500 Colorado Ave. 
Santa Monica, Calif. 


PROGRAM OUTLINE 
TUTORIAL SESSIONS — October 3, 1966 


Information Systems Design — R.M. Hayes — UCLA 
Information Center Operations — A. Kent — University of Pittsburgh 
Usage of Information — S. Herner — Herner and Co. 
Evaluation of Hardware and Software — To be announced 
Language Data Processing — H.P. Edmundson — SDC 
Development of a Theory — D. Hillman — Lehigh University 


STUDENT PROGRAM — October 3, 1966 


Special Session — Student Papers 
Panel Discussion — Student Chapter Activities 
Cocktail Hour 


J. Harvey — Chairman, Student Membership Committee 


PROGRESS REVIEW SESSIONS — October 4-7, 1966 


Professional Aspects of Information Science and Technology — R.S. Taylor — Lehigh University 
Information Needs and Uses — H. Menzel — New York University 
Content Analysis, Specification and Control for Document Retrieval Systems — P. Baxendale — IBM 
File Organization and Search Techniques — D. Climenson — U.S. Government 
Man-Machine Communication — R.M. Davis — Dept. of Defense 
Evaluation of Indexing Systems — C.P. Bourne — Programming Services, Incorporated 
Automated Language Processing — R.F. Simmons — SDC 
New Hardware Developments — M.E. Stevens — National Bureau of Standards 
Information System Applications — J. Baruch — Bolt, Beranek and Newman 
Library Automation — D.V. Black — University of California, Santa Cruz 
Information Centers and Services — G.S. Simpson — Battelle Memorial Institute 

National Information Issues and Trends — J. Sherrod — Atomic Energy Commission 


SPECIAL FEATURES 


Author Forums                            Special Interest Groups 
Discussion Groups                        Proceedings 
Prize Papers                             Award of Merit 
Special Libraries Association Session    Exhibitor Presentations 
Placement Service                        Tours 
Exhibits                                 Evening in Disneyland 
Information Theater                      Buffet Luau 
Chapter Officers Workshop 


Address additional inquiries c/o 
Technical Program Chairman 


AS WE MAY THINK 
by VANNEVAR BUSH 


As Director of the Office of Scientific Research and Development, Dr. Vannevar Bush has coördinated the 
activities of some six thousand leading American scientists in the application of science to warfare. In this significant 
article he holds up an incentive for scientists when the fighting has ceased. He urges that men of science should then 
turn to the massive task of making more accessible our bewildering store of knowledge. For years inventions have 
extended man's physical powers rather than the powers of his mind. Trip hammers that multiply the fists, 
microscopes that sharpen the eye, and engines of destruction and detection are new results, but not the end results, of 
modern science. Now, says Dr. Bush, instruments are at hand which, if properly developed, will give man access 
to and command over the inherited knowledge of the ages. The perfection of these pacific instruments should 
be the first objective of our scientists as they emerge from their war work. Like Emerson’s famous address of 
1837 on “The American Scholar,” this paper by Dr. Bush calls for a new relationship between thinking man and 
the sum of our knowledge. — THE EDITOR 


This has not been a scientist's war; it has been a 
war in which all have had a part. The scientists, 
burying their old professional competition in the 
demand of a common cause, have shared greatly and 
learned much. It has been exhilarating to work in 
effective partnership. Now, for many, this appears to 
be approaching an end. What are the scientists to do 
next? 

For the biologists, and particularly for the medical 
scientists, there can be little indecision, for their war 
work has hardly required them to leave the old paths. 
Many indeed have been able to carry on their war research 
in their familiar peacetime laboratories. Their 
objectives remain much the same. 

It is the physicists who have been thrown most 
violently off stride, who have left academic pursuits 
for the making of strange destructive gadgets, who 


edge of his own biological processes so that he has had 
a progressive freedom from disease and an increased 
span of life. They are illuminating the interactions of 
his physiological and psychological functions, giving 
the promise of an improved mental health. 

Science has provided the swiftest communication 
between individuals; it has provided a record of ideas 
and has enabled man to manipulate and to make ex- 
tracts from that record so that knowledge evolves 
and endures throughout the life of a race rather than 
that of an individual. 

There is a growing mountain of research. But there 
is increased evidence that we are being bogged down 
today as specialization extends. The investigator is 
staggered by the findings and conclusions of thousands 
of other workers — conclusions which he cannot find 
time to grasp, much less to remember, as they appear. 


Editorial 


Twenty-one years after publication of “As We May Think,”¹ the seminal publication for the 
field of Documentation, it is interesting to review the statements and positions it makes. In a recent 
conversation Dr. Bush noted with regret that the “MEMEX” idea remains far short of accomplishment. Knowing, 
as we do, that many of the devices and techniques suggested in “As We May Think” (copying techniques, 
instant photography, microforms, associative memories, character recognition, etc.) are in being, why does he 
feel this way? 

The answer may well lie in the spirit of our inquiries and our research. Dr. Bush suggests more, much 
more than gadgetry in “As We May Think.” The concluding paragraphs of that paper which follow are 
well worth the moments they take to read. 


In the outside world, all forms of intelligence, whether of sound or sight, have been reduced to the 
form of varying currents in an electrical circuit in order that they may be transmitted. Inside the human 
frame exactly the same sort of process occurs. Must we always transform to mechanical movements in 
order to proceed from one electrical phenomenon to another? It is a suggestive thought; but it hardly 
warrants prediction without losing touch with reality and immediateness. 

Presumably man's spirit should be elevated if he can better review his shady past and analyze more 
completely and objectively his present problems. He has built a civilization so complex that he needs 
to mechanize his records more fully if he is to push his experiment to its logical conclusion and not merely 
become bogged down part way there by overtaxing his limited memory. His excursions may be more 
enjoyable if he can reacquire the privilege of forgetting the manifold things he does not need to have 
immediately at hand with some assurance that he can find them again if they are important. 

The applications of science have built man a well-supplied house, and are teaching him to live healthily 
therein. They have enabled him to throw masses of people against one another with cruel weapons. They 
may yet allow him truly to encompass the great record and to grow in the wisdom of race experience. 
He may perish in conflict before he learns to wield that record for his true good. Yet in the application 
of science to the needs and desires of man, it would seem to be a singularly unfortunate stage at which to 
terminate the process, or to lose hope as to the outcome. 


ARTHUR W. ELIAS 
Editor, American Documentation 


1 V. Bush, As We May Think, The Atlantic Monthly: 101-108 (July 1945). 


Coming of Age in Academe—Information Science at 21 

The little tyke was conceived during the fleeting affair 
between that somewhat shabby and passive old maid, 
librarianship, and the rich playboy, science. The poor 
kid hasn’t even been named yet; in fact most of us 
don’t even know what he looks like. Still, they were 
all talking about him at the FID (International Federation 
for Documentation) Congress in October in 
Washington. Some of the family felt that he ought to 
be reared by his mother in her flat on the outskirts of 
Academe. The others said that his father could give 
him a better life, with prestige, wealth, and status, on 
the best street in town. They were already calling him 
by his father’s name, hoping he would be known as 
“Information Science.” (1) 


Information science was fathered, not by “the rich 
playboy, science,” but rather by the concern for information 
transfer. It was, to be sure, born in the house of 
science. Its birth trauma occurred during World War II, 
and the birth announcement appeared in Atlantic in 
July 1945 (2). “As We May Think” has, in 21 years, 
generated so much thought and action that its author, 
Vannevar Bush, might himself be called the father of 
information science. 

Bush, more than anyone else in this country, had been 
concerned with ways to facilitate the transfer of scientific 
information toward the very practical goal of winning the 
war. This done, he turned to the extension of these 
means to the service of research more broadly. He called 
for the creation of new tools, using then-existing technology, 
which would free the researcher from enslavement 
to repetitive rote operations. The system he 
envisioned, the “Memex”, would be devoted to expanding 
the memory and the other intellectual processes of the 
researcher. Little has been done in these 21 years toward 
making the Memex a reality. But the concept Bush proposed, 
of a technology to serve the intellectual processes, 
gained momentum. What began as technology has become 
science and technology. Berry is right in calling information 
science a “tyke” only in the sense that the name is 
new. 

A few annual meetings ago (1963), the “documentalists” 
of the American Documentation Institute gathered 
under the banner of “Automation and Scientific Communication.” 
The next year the theme was “Parameters 
of Information Science.” The change in emphasis from 
science information to a science of information has left 
some observers wondering where ADI stands. There are 
those, including some members, who believe that no true 
science will emerge, that the field will always remain a 
collection of disparate disciplines coming together to 
solve practical problems. However, this year’s annual 
meeting theme sounds a far more optimistic note—‘Prog- 
ress in Information Science and Technology.” 

No one would contend that information science is a 
mature discipline. Berry reminds us [with a nod to 
Hoselitz (3)] of three conditions attending the birth of 
a new discipline: 


Problems: The existence and recognition of a set of 
new problems that attract the attention of several 
investigators. 

Generalizations: The collection of sufficient data to 
allow promulgation of generalizations with broad 
enough scope to focus on the common features of the 
problems under investigation. 

Recognition: The attainment of official or institutional 
recognition of new disciplines. 


The problems, while still ill defined, are certainly recog- 
nized. 

The generalizations do not come easily, and probably 
should not. But a great deal of data exists, and the data 
base is growing. The race to find solutions to pressing 
practical problems has delayed the development of uni- 
fying principles, but there is increased interest in such 
unification. This is evidenced by the creation of the 
Annual Review of Information Science and Technology, 
the first edition of which will be available at the ADI 
October annual meeting. This ADI project, supported 
by the National Science Foundation and System Develop- 
ment Corporation, has brought together for critical 
review the literature of 12 areas of special interest: 


Professional Aspects of Information Science and Technology 
Information Needs and Uses 
Content Analysis, Specification and Control for Document Retrieval Systems 
File Organization and Search Techniques 
Automated Language Processing 
Evaluation of Indexing Systems 
New Hardware Developments 
Man-Machine Communication 
Information System Applications 
Library Automation 
Information Centers and Services 
National Information Issues and Trends 


As a further effort toward deriving general principles, 
the annual meeting will feature progress review sessions, 
based on these subject fields. The sessions will be panel 
discussions, with the chapter authors as principal 
reviewers. 

The third condition, official recognition, has been extended 
to the problem, but not to the discipline. Vannevar 
Bush's position during the war is evidence of this. 
From the 1958 Baker Report (4) to last year’s COSATI 
study (5), blue-ribbon panels have recognized the need 
for a cadre of specialists; and some have in effect advocated 
the creation of the discipline, but none has treated 
either as an accomplished fact. 

Institutional recognition is shown in the increase in 
the number of universities now giving courses and degrees 
in information science, a growth so rapid that a recent 
editorial described it as “the education explosion” (6). 
At the 1964 ADI Annual Meeting a large number of 
contributions were concerned with education (7), and 
one paper addressed itself to the problems of accreditation 
for the schools teaching information science. 

The custodial disagreement over “information science” 
seen by Berry is a sign that several segments of the 
intellectual community find the “new science” a desirable 
member of their families. Fortunately, ideas need not 
be the exclusive property of any one group; and while 
the fight over custody will, no doubt, continue, information 
science will probably split off into various disciplines. 
Some segments have found homes in schools of science 
and technology, like Georgia Tech; some, in schools of 
librarianship like Western Reserve and UCLA. It seems 
appropriate that the theoretical aspects appear to be 
prospering in a philosophy department at Lehigh. The 
philosophers have, after all, shown a strong interest in 
the logical and epistemological aspects of this “new” 
field. 

Many of the schools have established ADI student 
chapters, which will participate in this year’s national 
meeting (and those of the future), with a special program 
to be inaugurated for student members, including 
the award of student prize papers; these will parallel 
the awards made for outstanding papers contributed by 
the regular members. 

To Hoselitz's criteria, as reported by Berry, we would 
add three more, as measures of maturity in a discipline. 
One is responsible judgment. In the post-World War II 
days, emphasis shifted away from the small system that 
Bush envisioned to serve the individual researcher. The 
accent of information technology was placed on the creation 
of large, centralized files, such as those of the defense 
and space agencies. 

There was what we now consider an almost touching 
naivete about the ability of the computer to solve 
the problems of information transfer. A few voices of 
dissent were heard even during the heyday of centralization. 
Douglas Engelbart (8) in 1960 expounded the 
idea of microdocumentation, urging further development 
of small systems to aid the individual, along the 
lines of the Bush concept. 


Having pioneered in development of such systems, the 
Defense Department some time ago began seriously to 
question their value; and the current DOD emphasis 
on decentralization and specialization is the result. The 
computer-based large file at Defense Documentation 
Center now does what it can do well, which is secondary 
distribution, while information centers hold out promise 
of performing well a wide range of information services. 
Those services are related more directly to the complex 
intellectual operations that are part of research. 
In this, they are more in keeping with the spirit of Bush's 
initial plan (technology in the service of the individual 
researcher), and they bring to the researcher the 
real benefits of large-file operations. 

The return to the needs of the working scientist is 
evident in a resurgence of interest in use studies. These 
are nothing new, of course. Librarians have been making 
them for years, usually in connection with attempts to 
determine the superiority of one form of catalog over 
another. Mortimer Taube (9) had no use for such 
studies, which to him made as much sense as the doctor 
asking the patient what treatment the patient wanted; 
but the current interest in use studies seems little affected 
by Taube’s arguments. Concern with the empirical study 
of information use patterns, rather than with easy 
generalization and a priori system-building, is a sign of 
health in a discipline. It is probably a sign of youth, 
but hardly of infantilism. It must be accepted as a 
sign of considerable maturity that information scientists 
and technicians now show a tendency to abandon the 
belief in the easy answer, and again address themselves 
to the complexity of information transfer. 


A second added criterion of a discipline's maturity is 
the adequacy of its educational structure. For information 
science, drawing on many subjects, it is difficult to 
know what should be taught, and by whom. Graduate 
schools are now experimenting with many kinds of curricula, 
but course-structuring is difficult for specialists 
whose backgrounds differ widely. Several ADI chapters 
are offering short courses. The 1966 Annual Meeting 
will itself present such an effort, in the form of tutorial 
sessions on six general subject areas: 


Information Systems Design 
Information Center Operations 
Usage of Information 
Evaluation of Hardware and Software 
Language Data Processing 
Development of a Theory 


A third mark of a scientific discipline is the free flow 
of ideas under critical scrutiny. Research in information 
systems has in the past had a certain aura of the occult, 
for many reasons: perhaps the most important two are 
security regulations and the poor definitions of information 
system concepts. The obscurantism that resulted 
has sometimes been deliberate, sometimes an honest 
failure in communication; in either case it has stultified 
criticism. The promotion of a more critical attitude 
will be enhanced by the publication of the Annual Review 
and the presentation of the progress review sessions. 
Increasing educational opportunities have already had 
this effect. 
Until recently, one potential but reluctant source for valuable criticism was the librarians; for the most part, they hung back. Some held in contempt what they considered a barbaric new technology. Some feared it, and some still do. But more and more of them have come to understand the technology, or at least what they need to understand in order to use it. As information science deepens its theoretical basis, the gulf between it and library science grows narrower, particularly with the special and technical librarians in information technology, who design and run many of the systems. At the ADI meeting in October, the Special Libraries Association will sponsor a session on "User Reactions to Non-Conventional Systems" designed to promote understanding and encourage criticism.

Such results are encouraged by the fact that, within the past year, the American Library Association has created a Division of Library Automation and Information Science. With three-quarters of ALA-accredited library schools now teaching courses in information science, the librarians in general, rather than only technical librarians, are finally beginning to engage in a dialog that will certainly improve both the free flow of ideas and the critical scrutiny of those ideas.

Information Science has gone beyond its childlike preoccupation with electronic toys. The problems it has identified are generally recognized; even the popular journals now discuss the "information explosion." The information scientists have amassed a considerable body of knowledge about the properties, behavior, and flow of information. A discipline is in the offing.


Official and academic. кышы of the 


Twenty-one years ago Vannevar Bush announced to the world the birth of a new science. The American Documentation Institute, at its Twenty-Ninth Annual Meeting to be held in Santa Monica, California, October 3-7, 1966, will point with pride to the "Progress in Information Science and Technology" that has taken place since this historic announcement.


Librarianship and the Science of Information *

Utilizing query responses from accredited library schools, courses offered have been analyzed to determine emphasis on various subjects, with special reference to Information Science (IS), i.e., the study of the information processes, and Science Information (SI), i.e., the bibliography of science, operation of scientific information services, etc. Generally, schools continue to stress subjects related to operation of public and school libraries. Certain library schools are developing programs and/or courses in IS and SI, and in some cases the two are combined. Many schools now offer some training in nontraditional techniques. Increased stress on science information and the addition of courses in the new information technology have not radically altered the theoretical structure of library education. But the Information Science approach promises to contribute greatly to that structure. Librarianship possesses a long-established corpus of knowledge relating to a science of information. Librarians, traditionally service-oriented rather than research-oriented, have not exploited that body of knowledge for general principles of IS. Research should be directed to the areas of:

1. Cataloging and classification: the logical and epistemological underpinnings of respective systems; the relationships of given systems to prevailing theory of knowledge.

2. The technique of reference: its rich potential contribution to the theory of problem solving.

3. The understanding of the social and institutional framework of the information community.

Library schools need now to accompany technical instruction with research into these and similar problems in order to contribute to the theory of IS and to gain from IS better technique for the practice of librarianship.

JOSEPH C. DONOHUE †
Informatics, Inc.
Sherman Oaks, California

* Presented October 10-18, 1965, at the International Federation for Documentation (FID), Washington, D. C. This study was begun in November 1963 and was first discussed in a short paper in the 1964 Proceedings of the American Documentation Institute, Parameters of Information Science. Publication of detailed data on this subject will be delayed until the analysis of each school's curriculum can be reviewed by each school's [...].

† The author wishes to express thanks to the faculties of the various schools queried for their cooperation and encouragement; to colleagues at the General Electric Company-TEMPO and The RAND Corporation, for providing technical and clerical assistance. Mr. William Way of The RAND Library contributed greatly, especially in analysis of statistical data.

This is the report of a continuing study of information science education in graduate library schools in the U.S. and Canada. The purpose of the study is to find out what is the extent and nature of each school's involvement in information science education, and to identify, if possible, a core of theory common to the curricula, as well as differences in the approaches of the respective schools.

• Information Sciences

Information science is frequently confused with science information. For the purpose of the study, science information courses (Fig. 1) are those that (like science [...]) are concerned with the techniques of information handling as applied to science. Information Science, on the other hand, is, to use Taylor's definition:

The study of the properties, behavior, and flow of information. It includes
(1) environmental aspects of information and communication,
(2) information and language analysis,
(3) the organization of information,
(4) man-system relationships.

Fig. 1. Science information courses: techniques of information handling as applied to science. Information science (after Taylor): properties, behavior, and flow of information; includes (A) environmental aspects of information and communication, (B) information and language analysis, (C) organization of information, (D) man-system relationships.

• Study Approach

The approach of the study is this: using catalogs, course descriptions, and in some cases additional information supplied by deans of schools, courses were divided into 21 classes by subject matter in four major categories, as follows:


Category 1: Librarianship per se—its special tech- 
niques, and its professional and ethical aspects. 


Fig. 2. Total for 34 schools: about 2950 hours; mean per school, 87; range, 50-170. Categories I and II (librarianship per se and librarianship applied) together account for 93% of all courses; Category III (information sciences) and Category IV (other disciplines) make up the remainder.


Category 2: Librarianship as applied to special 
types of libraries, subjects, and/or clientele. 

Category 3: The information sciences—the study of
information, its properties, behavior and flow. 

Category 4: Courses from other disciplines—given 
in the school of librarianship but only indirectly related 
to librarianship. 


Only gross figures and some highlights will be presented in this report. The tabulation in Fig. 2 shows that for the 34 schools finally included, course offerings total about 2950 hours, with a mean of 87 hours per school and a range from 50 to 170.

In Categories 1 and 2 shown in Fig. 3, librarianship 
per se and librarianship as applied, the total, 93%, is 
about equally divided between them. The preponderance 
of courses in these categories is not surprising. 

In Fig. 4, Category 3, the information sciences, accounts for 5%, and Category 4, other disciplines, totals 2%.
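The gross figures quoted so far can be cross-checked with simple arithmetic. The sketch below uses only the rounded values given in the text (34 schools, about 2950 hours, and category shares of 93%, 5%, and 2%). The per-category hour estimates it derives are illustrative approximations, not figures reported by the study.

```python
# Cross-check of the rounded figures quoted in the text.
SCHOOLS = 34
TOTAL_HOURS = 2950  # "about 2950" in the text

# Shares of all courses as quoted: Categories 1 and 2 combined, 3, and 4.
shares = {
    "1+2 librarianship": 93,
    "3 information sciences": 5,
    "4 other disciplines": 2,
}

mean_hours = TOTAL_HOURS / SCHOOLS          # should round to the quoted mean of 87
shares_total = sum(shares.values())         # the three shares should cover everything

# Approximate hours per category implied by the shares (derived, not reported).
approx_hours = {name: TOTAL_HOURS * pct / 100 for name, pct in shares.items()}

print(f"mean hours per school: {mean_hours:.1f}")
print(f"shares add up to {shares_total}%")
for name, hours in approx_hours.items():
    print(f"category {name}: roughly {hours:.0f} hours")
```

The mean works out to a little under 87 hours, which agrees with the rounded figure in the text, and the three shares sum to 100%.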

Schools giving at least one course in information 
science represent 77% of all schools. This is a significant 
growth from the previous year, of 30%, which is shown 
in Fig. 5. 

With regard to the number of credit hours in courses in information science given in the individual schools, two modes are found. Eight schools offer a total of 11 hours, or about three courses; an additional 12 schools offer three hours, or one course. Most schools are approaching information science very gradually, with eight schools accounting for 75% of all courses. An examination of the aggregate figures for all schools reveals some rather dramatic contrasts.

Fig. 3. Percent of all courses in Category I, librarianship per se (1.1 background, history, etc.; 1.2 administration; 1.3 selection and acquisition; 1.4 cataloging and classification; 1.5 technical processes; 1.6 reference, bibliography), and in Category II, librarianship as applied (2.1 school, children's libraries; 2.2 public; 2.3 academic, research; 2.4 "special" libraries; 2.5 fine arts; 2.6 humanities; 2.7 social sciences; 2.8 science and mathematics).

Fig. 4. Percent of all courses in Category III, information sciences (3.1 use and user studies; 3.2 operation of information systems; 3.3 theoretical information science, analysis and design), and in Category IV, other disciplines (4.1 languages; 4.2 mathematics; 4.3 science and engineering; 4.4 social sciences).


In Category 1, classes in background and history lead with 14%. Reference and bibliography follow with 13%, and cataloging and classification, 10%. This distribution is especially interesting since cataloging and classification have been generally considered the major intellectual discipline of the curriculum.

In Category 2, those classes in which the basic skills are applied to particular kinds of librarianship, 18% of the total curriculum is devoted to preparation for work in children's and adolescents' libraries. This is followed by public libraries with 6%, and by science, technical, and mathematics libraries (5%). Research libraries, social science libraries, and humanities libraries are about equal (4% each). Special librarianship and fine arts are in the rear guard, accounting for about 2% each.

In Category 3, the information sciences (the properties, behavior, and flow of information), the greatest number of hours are devoted to courses emphasizing operational description of nonconventional data processing and information retrieval. These courses, being for most students their only introduction to such subjects, make up 3% of the curriculum. An additional 1% is devoted to theoretical information science, including system analysis and design; and a further 1% is concerned with use studies, [...] study of conventional card catalog use.

Finally, in Category 4, among courses from other disciplines given in the library school, social sciences account for 1.5%; mathematics and language, 0.2% each; and science and engineering, 0.1%.

Fig. 5. Schools giving at least one information science course, 1964-1965 and 1965-1966. Eight schools offer three or more courses each, or 75% of all courses given in all schools; 11 schools offer one course.


• Analysis

The relationships implicit in these figures will require much more analysis. Here are some tentative conclusions:


1. Library schools retain their overwhelming emphasis on training of personnel for public and school libraries. However, some attention is being turned to science librarianship, and to a considerably lesser extent to information science.

2. Correspondence from many of the schools indicates a strong desire on their part to strengthen the offerings in both science information and information science, but this desire is thwarted by lack of qualified instructors in both areas, and especially in information science.

3. As demand grows and instructors become available, we may expect to see significant growth in both kinds of courses.

4. The presence of more scientifically oriented people in the library profession may, over a long period of time, indirectly affect the modes of thought of that group.

5. A more radical change may be expected as a result of the notion of a Science of Information being developed in some academic circles, including some library schools. Information science is by nature research-oriented (which librarianship has not been, in this country at least). To the extent that librarianship and its schools accept that notion, it has potential for affecting the curriculum. One library school dean, in fact, expressed the desire that the information science approach would be reflected in every course in the curriculum, not excepting children's librarianship. Many librarians are aware of the potential benefit of this approach. What is perhaps less appreciated is the contribution that librarianship has to offer to the theory of information science.


• Recommendations

In conclusion, the following three areas require research as shown in Fig. 6:

1. Cataloging and classification: Implicit in the categories of classification systems, and in the techniques of cataloging, are basic assumptions about knowledge and about our means of knowing. The systems should be studied as both the creators and the creations of the environments in which they arise.

2. The structure of the information community: Librarians, in search of information, explore and often describe in their professional literature the structure of the information community and the relations among its components. Such descriptions, properly studied, will contribute to the empirics of information science and, properly quantified, can offer insights into the sociology of knowledge.

3. Reference services: The descriptive literature of reference service, and the skills passed on in its oral tradition, contain the explanation of why a reference librarian is often more effective in literature search than our most advanced computer program. A group at Hughes Dynamics (2) has studied these techniques step by step, showing that such analysis can offer insights on the heuristic process itself.


In these and other areas librarians can offer a great body of information, hitherto accumulated in a pragmatic and largely uncritical manner. They need now to apply analytical tools to that corpus of knowledge: to derive from it general principles, contributions to the theory of information science, and to gain in the process better tools for the practice of librarianship.

Fig. 6. Research on these areas (cataloging and classification: logic and assumptions; structure of the information community: the sociology of knowledge; reference services: the heuristic process) will yield contributions to information science theory and better tools for librarianship.


Library Catalogue Production on Small Computers *


—- 


The paper discusses the production of library catalogue cards, specifically treating the Columbia-Harvard-Yale Medical Libraries Computerization Project.† A description is provided of the bibliographic data input, general computer processing, and various output modes. The paper also treats of error controls, human edit procedures, and the complexity and variety of present bibliographic organization in the context of computer manipulation.

FREDERICK G. KILGOUR
Associate Librarian for Research and Development
Yale University

* Supported in part by National Science Foundation Grant No. 179.

† The Harvard Medical Library withdrew from the Project as of 30 June 196[?].

Particulars of experiences suffered in struggles to design, write, and coax into operation computer programs to process the bibliographic data on library catalogue cards do not constitute a sizeable literature. Indeed, it is doubtful that a literature can be said to exist. Nevertheless, initial encounters with this variety of symbolic manipulative programming suggest that there is much to be learned and reported.

This paper will be constrained to a consideration of processing on general-purpose computers with final output being in upper and lower case characters. Institutions most widely known to be engaged in such activity are the National Library of Medicine (NLM) (1), Florida Atlantic University (FAU) (2), the Ontario New Universities Library Project (ONULP) (3), and the Columbia-Harvard-Yale Medical Libraries Computerization Project (CHY) (4). Stanford University and the University of California at Santa Cruz are also developing systems for book-form catalogue production.

NLM produced the first computerized issue of the Index Medicus in January 1964, but the printing was in the upper case characters of a high-speed drum printer. The July issue appeared in upper and lower case, having been composed on a high-speed chain printer. Since August 1964, the Index Medicus has been set up on NLM's GRACE (Graphic Arts Composing Equipment), which produces the most handsome computer print-out available. FAU's book-form catalogue appeared in the autumn of 1964, to be followed in the spring of 1965 by the ONULP catalogue. The first Columbia-Harvard-Yale product was a monthly accessions list that appeared in October 1963 but was in upper case characters (Fig. B), and it was not until September 1964 that catalogue cards bearing upper and lower case characters began to be produced routinely.

Small, medium, and large computers are used to process bibliographic data. CHY employs an IBM 1401; FAU, an IBM 1460; NLM, a Minneapolis Honeywell 800; and ONULP, an IBM 7094/1401 configuration. Stanford will be using a 1401 and Santa Cruz, an IBM 1410. The Yale University Library is also designing programs to produce catalogue cards and a book-form catalogue; the catalogue-card production programs will run on an IBM 7094-7040 Direct Coupled System, but it seems likely that in the foreseeable future small computers will be much more widely available to libraries than large machines.

Moreover, the oft-repeated dictum that "the larger the computer, the cheaper the processing" does not always hold for bibliographic data processing. Some programs require prolonged reading of tapes during main-frame time; and since tapes can spin as fast, or nearly as fast, on small computers as on large, it is probable that processing is cheaper on a small machine whenever tape-spinning time equals or exceeds processing time on a large machine.

Most, but not all, of the observations reported in this paper derive from experience with the CHY Project. The CHY Project began in the autumn of 1961, following a suggestion by Lawrence Buckland, Ben-Ami Lipetz, and David Sparks (all at Itek at that time) that it was possible to produce cataloguing information in machineable form that could be used in the production of card catalogues and that could be accumulated over a period of several years, to be put into a computer file when sufficient information was available to justify computerized bibliographic information retrieval. The proposal for a grant was made to the National Science Foundation in 1962, and NSF made the award in the summer of 1963. However, Yale began keypunching cataloguing information early in 1963, so that all titles processed at Yale since 1 January 1963 are in machine-readable form; the other institutions followed soon after. As already noted, CHY began routinely to produce monthly accession lists in October 1963 and catalogue cards in September 1964. Details of these procedures may be found in an article entitled "Mechanization of Cataloguing Procedures" (4).
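The earlier remark about tape-spinning time can be made concrete with a small cost comparison. The sketch below is only an illustration under loudly stated assumptions: the hourly rates, the compute times, and the simple model in which a run occupies the machine for tape time plus compute time are all hypothetical and do not come from the paper.

```python
def job_cost(tape_minutes, cpu_minutes, rate_per_hour):
    """Cost of a run that occupies the machine for tape time plus compute time.

    Assumes tape reading and computation do not overlap (a simplification).
    """
    return (tape_minutes + cpu_minutes) / 60 * rate_per_hour


# Hypothetical job: tapes spin equally fast on both machines.
TAPE_MINUTES = 10.0

# Hypothetical large machine: fast compute, expensive hourly rate.
large = job_cost(TAPE_MINUTES, cpu_minutes=5.0, rate_per_hour=300.0)

# Hypothetical small machine: four times slower compute, much cheaper rate.
small = job_cost(TAPE_MINUTES, cpu_minutes=20.0, rate_per_hour=60.0)

print(f"large machine: {large:.2f}  small machine: {small:.2f}")
```

With these invented figures the tape time (10 minutes) exceeds the large machine's compute time (5 minutes), and the small machine is cheaper even though it computes more slowly. Different rates would of course shift the balance.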

In 1965, the Yale University Library initiated a project to computerize the procedures of the new Kline Science Library at Yale, looking forward to computerized bibliographic information retrieval. However, the system for the Kline Library is being designed to encompass all libraries at Yale. Yale is also designing an acquisitions and in-process control system that will be computerized. The aim of this second system is to gain control over all processes from the time a requester asks the Library to purchase a book until cards are in the catalogue and the book is on the shelf. The first products of these two new systems are expected to appear in the late spring and early summer of 1966, but considerable experience has already been garnered.

In the CHY system, the cataloguer catalogues each title on a worksheet (Fig. 1). The next person in the work flow (Fig. 2) is the keypuncher, who punches the information on the sheet into punched cards. There is one punched card for each line on the worksheet, and the group of cards for each worksheet is called a decklet. These decklets are run through an IBM 870 Document Writer and listed off in upper and lower case for proofreading. After proofreading has been accomplished and corrections completed, groups of decklets are taken to the computer for processing; the computer is an 8K 1401 with two tape drives, modified to drive a 120-character, upper and lower case chain on the IBM 1403 printer, although the programs were originally designed for a two-tape 4K 1401.

Fig. 2. Flowchart of catalogue card production: edited information on tape (one record per decklet); edited information on tape (one record per final catalog card); 1401 TTSORT program; 1401 CHY-5 program; formatted "card images" on tape; catalog cards (1403 printer).

Five programs do the processing, and each loads its successor into the computer after it has completed its processing of the data. The first program (CHY-1) edits the data on each worksheet and writes the edited data onto a magnetic tape; it also sets up a sort control for each entry under which the card is to be filed in the card catalogue. The second program (CHY-2) explodes the edited tape record produced by the first program into a total number of tape records equivalent to the number of catalogue cards that will be needed. The third program (TTSORT) is a modified IBM package program, and it sorts the card records into various packs, each pack being destined for an individual card catalogue. The fourth program (CHY-5) sets up tape images of each card in its final format, and the fifth program (CHY-6) prints out the cards on card stock directly on the 1403 printer (Fig. 5). If catalogue cards are to be produced on an 870 Document Writer (4) instead of on a 1403 Printer, a sixth program (CHY-7) replaces CHY-6. This program punches cards on the IBM 1402 card read punch, and these cards drive the 870. When punched cards for an 870 are to be produced, the programs can be run on a two-tape 4K machine, but CHY-6 requires an 8K core as presently written.
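The chain of programs just described can be pictured as a pipeline in which each stage hands its output to the next. The following is a minimal modern sketch of that idea, not a reconstruction of the SPS programs: the record layout, the function names, and the sample data are all hypothetical, chosen only to mirror the roles of CHY-1 (edit), CHY-2 (explode), TTSORT (sort into packs), and CHY-5 (format card images).

```python
from dataclasses import dataclass


@dataclass
class CardRecord:
    pack: int      # card catalogue the card is destined for
    heading: str   # added heading ("" for the main-entry card)
    body: str      # edited bibliographic description


def edit(decklet_lines):
    """CHY-1 analogue: join the punched-card lines of one decklet into one record."""
    return " ".join(line.strip() for line in decklet_lines if line.strip())


def explode(body, headings, packs):
    """CHY-2 analogue: one record per catalogue card needed, for every pack."""
    entries = [""] + list(headings)  # main-entry card plus one card per heading
    return [CardRecord(pack, h, body) for pack in packs for h in entries]


def sort_into_packs(cards):
    """TTSORT analogue: order the card records by pack number (stable)."""
    return sorted(cards, key=lambda card: card.pack)


def format_card(card):
    """CHY-5 analogue: lay out a card image, heading above the description."""
    top = card.heading.upper() + "\n" if card.heading else ""
    return top + card.body


def run(decklet_lines, headings, packs):
    """Chain the stages; the returned list stands in for printing (CHY-6)."""
    body = edit(decklet_lines)
    return [format_card(c) for c in sort_into_packs(explode(body, headings, packs))]


# Hypothetical decklet: two punched cards, one subject heading, two packs.
images = run(["Author, A. ", "A title."], headings=["Church music"], packs=[2, 1])
print(len(images))
```

One decklet with one added heading and two packs yields four card images, two per pack, which is the "explosion" the text attributes to CHY-2.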

Each month, the decklets that have accumulated during the month are run on another program to produce a monthly accessions list; on two occasions, working copies of author-entry book-form catalogues have been produced. However, like the accessions list program, the book-form catalogue program was written before the upper and lower case chain was available, so the output is all in upper case.

These programs were written in IBM's Symbolic Programming System (SPS), a symbolic-assembly language of the type to be preferred for coding programs to process bibliographic data on small computers. With these machine-oriented languages it is possible to write efficient and sophisticated programs that can take full advantage of the capabilities of a small computer or, for that matter, of a large computer. And since runs are made almost daily, it is important that processing be efficient to keep costs low.


CHY-1, the first program, edits each decklet and writes a record on tape (Fig. 3a). It also sets up the sort control by which each card can be alphabeted for filing within its pack, and writes each sort field as a separate record following the long card record. CHY-1 formulates the address of the heading in the long record at the beginning of each sort field, [...] for the main entry. By placing bits on the middle digit of the address, it indicates whether the heading is a topical subject, name subject, or added entry, information that CHY-5 will need to determine the location of each type of heading on its respective catalogue card. However, the programs have been run on two-tape configurations on which it has not been possible to sort alphabetically. As a result, the subroutines for setting up the sort fields have not been completely debugged.

Fig. 3. (a) Tape dump of records written by edit program, CHY-1. Three sort fields are at the bottom. (b) Tape dump of record produced by CHY-2. First two digits are pack number followed by alpha sort field, identification codes, and address (170) of title with an A bit on the middle digit to indicate heading is an added entry. (c) Tape dump of card image set up for printing.
CHY-1 requires over 5,900 spaces of core and in its 4K version is stored in part on tape. Some 1,600 spaces provide a work area, and place for control and processing routines. Two groups of subroutines, Phase B and Phase A, of 2,388 and 1,953 characters, are written on tape to be called into core to overlay each other as required. The program also writes the title and series category of each decklet between the two subroutines, because these categories of data must be retrieved if a sort field for a title, short title, or series entry is to be set up. Since these data are of varying lengths, the Phase A must be rewritten each time before the first is called (Fig. 4).
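The overlay arrangement described above can be checked with a short core budget. The sketch below uses the three sizes quoted in the text; it assumes that a "4K" machine offers 4,000 character positions and that the two overlay sizes are given in the same order as the phase names, neither of which the text states outright.

```python
CORE_4K = 4000                    # assumed character positions on a "4K" machine
RESIDENT = 1600                   # work area plus control and processing routines
PHASES = {"B": 2388, "A": 1953}   # overlay sizes, assumed to be in the order quoted

# If everything had to be in core at once:
total = RESIDENT + sum(PHASES.values())

# With overlays, only the resident part and one phase are in core at a time:
peak = RESIDENT + max(PHASES.values())

print(total, peak, peak <= CORE_4K)
# prints: 5941 3988 True
```

The total of 5,941 matches the "over 5,900 spaces" quoted for CHY-1, while the peak of 3,988 with one phase resident would just fit a 4,000-position core under these assumptions.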

Fig. 4. Map of CHY-1 system tape: CHY-1, Phase B, stored title, stored series, Phase A.

The most difficult programming for processing bibliographic data is the setting up of sort controls. The complex requirements for writing sort fields determined the size of CHY-1. As Fig. 3a shows, punctuation and flags are removed except for dashes; in the main entry in the illustration, the square brackets, indicating that the author's name is not on the title page, have been removed as well as commas. The program converts ampersands to the correct conjunction for the language of the title page and transliterates, if necessary, characters with diacritical marks into the equivalent letters in the filing alphabet. For instance, the "ö" in the author's name illustrated becomes "OE" in the sort field. It is also necessary to remove an initial article from the sort field as shown in the figure. These and similar manipulative routines are space consuming and for book-form catalogue production must be much more accurately designed than for the production of catalogue card packs.
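The sort-field rules just listed (drop punctuation but keep dashes, expand ampersands, transliterate diacritics, remove an initial article) can be sketched in a few lines. This is a modern illustration, not the CHY-1 routine: the function name, the small language tables, and the `drop_article` switch are hypothetical, and a real filing code would need far fuller tables.

```python
import re
import unicodedata

# Hypothetical lookup tables, deliberately tiny.
CONJUNCTION = {"en": "AND", "de": "UND", "fr": "ET"}
INITIAL_ARTICLES = {
    "en": {"THE", "A", "AN"},
    "de": {"DER", "DIE", "DAS"},
    "fr": {"LE", "LA", "LES"},
}
UMLAUTS = {"Ä": "AE", "Ö": "OE", "Ü": "UE"}


def sort_field(heading, language="en", drop_article=False):
    """Build an upper-case filing key from a catalogue heading."""
    text = unicodedata.normalize("NFC", heading).upper()
    # Ampersand becomes the conjunction of the title-page language.
    text = text.replace("&", " " + CONJUNCTION[language] + " ")
    # Umlauted vowels file as the vowel plus E.
    for letter, filing in UMLAUTS.items():
        text = text.replace(letter, filing)
    # Any other diacritic is dropped by decomposing and removing combining marks.
    text = "".join(
        ch for ch in unicodedata.normalize("NFD", text)
        if not unicodedata.combining(ch)
    )
    # Remove punctuation and flags, keeping dashes (written here as hyphens).
    text = re.sub(r"[^A-Z0-9\- ]", " ", text)
    words = text.split()
    if drop_article and words and words[0] in INITIAL_ARTICLES[language]:
        words = words[1:]
    return " ".join(words)


author_key = sort_field("[Köllner, Georg Paul]", language="de")
title_key = sort_field("The Library & its Users", language="en", drop_article=True)
subject_key = sort_field("Church music - Hesse - Mainz")
print(author_key, "|", title_key, "|", subject_key)
```

The bracketed author heading loses its brackets and comma and files under "KOELLNER", in line with the transliteration the text describes; the title sample is invented to show the ampersand and article rules.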


In the original CHY sequence of programs that produce punch cards to drive an 870 (4), CHY-1 was followed by three programs: CHY-2, TTSORT, and CHY-4. However, when the upper and lower case chain became available, for which CHY-4 had not been designed, it was necessary to rewrite that program in two sections. The design for CHY-2 also proved inadequate and was rewritten. In their original form, CHY-2 and CHY-4 contained generator programs that wrote short series of a half-dozen instructions at most, constituting several programs that the load program used. Data on a control card determined the character of the generated programs. This technique is efficient in use of space and may augment speed if either or both are required; a four- or six-instruction program usually yields more rapid processing than a table search and is shorter than subroutines to be initialized with addresses and constants.


The three programs that presently follow the edit program are the second edition of CHY-2, TTSORT, and CHY-5, and these three programs use about 11,500 spaces of core. Each contains an initialization or generator program varying in size from 300 to 860 characters that is subsequently overlain by a work area. As already mentioned, CHY-2 expands the records it receives from CHY-1 into the number of records equivalent to the number of 3x5 catalogue cards required for each title. It accomplishes this expansion by combining the information received from CHY-1 as to the number of heading cards required, the main-entry card, and information on the control card for CHY-2 giving the number of packs of cards and the type of cards to go in each pack. This program is capable of setting up 99 packs, each containing from one to five types of catalogue cards (main entry, topical subject headings, etc.), and will place the tracings for the headings at the bottom of each type of card as punched on the control card. CHY-2 also sets up the sort control for call numbers, assigns this control and the controls received from CHY-1 to a sort field in the new record, and assigns for each new record a pack number in which the card will ultimately be printed (Fig. 3b).

TTSORT is a slow, two-tape sort that processes digit by digit, column by column. If four tapes were available on the equipment at Yale, it would be possible to sort more rapidly by pack and to alphabetize the cards within each pack. Next, CHY-5 writes on tape the image of the card as it will be printed on the upper and lower case chain or on the 870 Document Writer (Fig. 3c).
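A sort that works "digit by digit, column by column" resembles what is now called a least-significant-digit radix sort. The sketch below shows that technique on records whose first two characters are a pack number, as in the CHY-2 output; whether TTSORT worked exactly this way is an assumption, and the sample records are invented.

```python
def radix_sort_by_pack(records, digits=2):
    """Least-significant-digit radix sort on a fixed-width numeric prefix.

    Each pass reads the whole file once and distributes the records by one
    digit column, which is why such a sort is slow when the file is on tape.
    """
    for column in range(digits - 1, -1, -1):             # rightmost column first
        buckets = [[] for _ in range(10)]
        for record in records:
            buckets[int(record[column])].append(record)  # stable within a bucket
        records = [r for bucket in buckets for r in bucket]
    return records


# Hypothetical card records: two-digit pack number, then the sort field.
cards = ["03ACCENTUS", "01KOELLNER", "12CHURCH MUSIC", "01ACCENTUS", "03KOELLNER"]
sorted_cards = radix_sort_by_pack(cards)
print(sorted_cards)
```

The records come out grouped by pack, but within a pack they keep their input order rather than being alphabetized, which corresponds to the limitation the text reports for the two-tape configuration.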
Either CHY-6 or CHY-7 follows CHY-5. As already noted, the first prints cards on the 1403 and does so either side by side as shown in Fig. 5, or one up, the choice of output being determined by the setting of a sense switch. If CHY-7 is used instead of CHY-6, cards are punched on the 1402 containing the information from CHY-5; and these cards, as already noted, drive an 870.
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. entry. catalogue, with time required to produce | үй 
graphically accurate output in thé form of catalogue 22. 
‘eards or book-form catalogues yields the rather surpris- . 


CHY-6 uses 4440. spaces P соге, ам снул, тезу. | 


must be-much .more ‘accurately designed than for the . . 1800. 


Although iie programs S been кже for : в year 


and a half, they still harbór two major bugs, each of . 


which, performs. its trouble-making about once or twice 


‘during a. week. Also, the programs produce "only, single- 
catalogue „ cards and not. continuation. cards. Debugging | 
is not, complete because of press of other. duties, while . 
` , production of continuation cards will " require the- redesign 
‘of CHY-1. 
`` Аз might be divined. from this detailed description of 


the CHY set- of programs for processing bibliographic 
data, the original conception of ‘the processing was that: 
of a series of programs, each independently loaded | 3n 


: sequence by the operator. In the process of- writing the | 


programs, it became clear that-it was feasible to. dave . 


each load its successor and reduce human intervention in | 
' processing. As presently written, the five programs used : 
for punch-card production for 870 operation employ over Ў 


19 000 ‘spaces of core but run on а 4K machine. 


Retrospection suggests that better design might "be 
‘achieved in such programs by designing them as one pro- 
gram which might be 19K or more, and then converting: 


the program into-modules for sequential processing оп а 


computer with a smaller core. Of course, total design ШАРЫ 
“could not be done independently of modular sequénces;' 


rather, the modules would have to bé roughly outlined | 


before the processing design was undertaken. No matter ` 
what the approach, the sort-control module "will be Ше. 


largest component and the most difficult to design. `* 
Experience at Yale with the programs described above 

and in designing further programs alréady mentioned Чот 

Ше production of catalogue cards and book-form cata- 


logues throughout the University ‘Library system ‘has . 
‘already yielded interesting observations. Similar -obser- 


vations have been made from the éxperierice. of others. denn 


Perhaps the most consistent and most striking observa- -' 


tion ‘is that the programming of bibliographie data with- 


- out exception has taken longér, or has required miore `- 
‘programmers, than originally estimated. Where there .;: 


have been deadlines for operational production, they. en 


have not been miet, although the FAU. printed: catalogue 


. Was very close to its deadline. At Yale there has: ‘also, 
‘been’ experience in‘writing programs that produce output 


in upper case characters only on-the 1403, as well as the 


type of upper and lower case-output described: In pro-. gt 


grams for upper case output, the same bibliographie input 7. ; 
data;is employed as used in card production, but flags, . 


and characters not on а 48-character chain are removed. 2 
or converted in the read subroutine. Ап example of such. 


output is the accessions list depicted in Fig, 6. Compari- 
son of times required to. write programs for specialized 
output in upper case, such, as the accessions list or а main 
iblio- 


ing result that it takes from 6 to 20 times as long to 


program the latter. Iri is this extraordinarily ке differ | 
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> ег chorals." 





Р Ето. 5. Catalogue cards printed two up on an ІВМ 1403 


ence in effort required that leads to deadlines not being 
ше. 

The question not unnaturally arises as to why it should take so very much longer to program bibliographic data processing to yield a product having bibliographic integrity. There are three major reasons: first, the bibliographic data for each title is surprisingly complex for a small amount of data; second, the determination of the location of each bibliographic entry in a large catalogue is a complex procedure; and third, the generalization of programs giving them the capability of yielding catalogue cards and book-form catalogues having different formats adds another highly complex factor.

SELECTED LIST OF ACCESSIONS
CHURCH MUSIC -- HESSE -- MAINZ
KOLLNER, GEORGE PAUL 1902-
DER ACCENTUS MOGUNTIUS.
MAINZ 1950. HL3082 KTT

Fig. 6. Sample output for accessions list from same decklet used to produce Figs. 3 and 5

Inspection of Fig. 1 will give some appreciation of the complexity of data on a catalogue card, although at first glance the data seem quite simple. In the card shown, it was necessary to remove the punctuation (including the square brackets) to set up the sort control, and to eliminate the spaces occupied by punctuation so that the sorting field was left justified. Also, it was necessary for the program to recognize that, the text being in German, the lower case "o" with the umlaut over it requires transliteration to "oe" to be filed in its proper place. Similarly, the program had to recognize that "Der" was a German article to be removed from the sort control that determines the filing location of the title added entry. Moreover, the program must be able to accomplish the umlaut conversion and the article removal in author entries as well as in title, short title, and series-added entries.

From these examples and others already given, it is obvious that there is a huge, although not infinite, variety of outputs within categories of bibliographic data on each catalogue card. The variety is so enormous that it would be impossible to list each output, and moreover it would be useless to do so. In short, it is necessary to generalize the output to some extent in advance.
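The sort-control steps described for the sample card (strip punctuation and close up the gaps, transliterate the umlaut, drop a leading article) can be sketched as follows. This is a hedged illustration in Python, not the CHY routine: the function name `sort_control`, the umlaut table beyond the lower-case "o", the language codes, and the short article lists are all assumptions made for the example.

```python
import re

# Assumed transliteration table; the text only mentions the lower-case o with umlaut.
UMLAUTS = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}

# Assumed, deliberately tiny lists of initial articles keyed by a language code.
INITIAL_ARTICLES = {"ger": ("der", "die", "das"), "eng": ("the", "a", "an")}


def sort_control(entry, language):
    """Build a left-justified, upper-case sort field from a heading or title."""
    text = "".join(UMLAUTS.get(ch, ch) for ch in entry)
    # Remove punctuation, square brackets included.
    text = re.sub(r"[^\w\s]", "", text)
    # Splitting and rejoining closes up the spaces punctuation left behind.
    words = text.split()
    # Drop a leading article so the entry files under the next word.
    if words and words[0].lower() in INITIAL_ARTICLES.get(language, ()):
        words = words[1:]
    return " ".join(words).upper()


title_key = sort_control("[Der] Accentus Moguntius.", "ger")
author_key = sort_control("Köllner, Georg Paul, 1902-", "ger")
```

The same function serves author, title, and series entries, which is the point made in the text: the conversions have to work wherever the data occur, not only in one field.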


The CHY programs as described were designed to yield catalogue cards in packs and alphabetized within each pack for filing purposes, the filing, of course, to be done by humans. In the design of a book-form catalogue, it is necessary to set up the sort fields so that each entry, be it author, title, short title, conventional title, series, subject, joint author, editor, etc., will be located unambiguously in the arrangement of the catalogue so the users can find it following simple rules. Complexity of programming increases as the number of titles in the catalogue increases and as the number of types of entries for each title in the catalogue increases. For instance, in an author book-form catalogue there may be as many as 10 different types of entries, and a single filing sequence may involve up to four factors such as surname, given names, date of birth, and a phrase describing the author.
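A filing sequence built from several factors can be expressed as a composite key compared factor by factor. The sketch below is an illustration in Python only; the record type `AuthorHeading`, the choice to file undated headings before dated ones, and the sample names are assumptions, not rules taken from the Yale programs.

```python
from typing import NamedTuple, Optional


class AuthorHeading(NamedTuple):
    surname: str
    given_names: str
    birth_year: Optional[int]  # None when no date is given
    phrase: str                # descriptive phrase, may be empty


def filing_key(heading):
    """Return a tuple compared factor by factor: surname, names, date, phrase."""
    # Assumption: a heading without a date files before dated headings.
    year = heading.birth_year if heading.birth_year is not None else -1
    return (heading.surname.upper(), heading.given_names.upper(), year, heading.phrase.upper())


headings = [
    AuthorHeading("Smith", "John", 1920, ""),
    AuthorHeading("Smith", "John", 1890, "bishop"),
    AuthorHeading("Smith", "Adam", 1723, ""),
    AuthorHeading("Smith", "John", None, ""),
]
ordered = sorted(headings, key=filing_key)
```

Each additional entry type needs its own key of this kind, which is one way to see why complexity grows with the number of entry types.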
In a system of book-form catalogues, there should be an author catalogue, a title catalogue, subject catalogue, shelf-list, and official catalogue, of which the first three, or two of the first three, may be combined into one. It is this type of generalization of output that contributes the third major factor of complexity. When these three factors converge in one program, the overall complexity expands rapidly; it is this fact, more than any other, that necessitates the expenditure of a surprisingly large amount of effort to program bibliographic data processing.


Recommendations for details in program design appear elsewhere (5), but there are at least two more general observations to be made. Design should consciously attempt to reduce human participation to the absolute minimum necessary to avoid introduction of error. The history of machines, and particularly of machine tools, is characterized by an increasingly accurate product that has made possible assembly-line mass production employing interchangeable parts. Similarly, the more complete the machining of bibliographical records, the more universal will be the use of those records.

Some programmers have relieved the computer of processing it could do and placed responsibility on humans, holding that programming is thereby simplified and processing is more "positive"; despite such views, experience confirms that the machine can be trusted. Examples of activities assigned to humans that are better handled by machine include special flags to exclude initial articles to be dropped from a sort field; leaving a space to insert a nonspacing flag; and writing dates that can be extracted from the data. Computers are entirely capable of coping with such tasks and can assume the burden. To insure that the machine assumes as much of the burden as possible, the final design of input data should receive an isolated, last review to determine that humans are doing nothing in preparation of the data that the computer can do.

Finally, there are different types of error in input bibliographic data. Some types of error can prevent processing of a decklet, and both human and machine procedures should guard against and detect such error. At the other end of the spectrum are errors that do not interrupt processing and which cannot be recognized in the final output. An example of this latter type of error might be a missing added-title entry that the cataloguer intended to put in the catalogue but for which he neglected to give the instruction. It is most likely that the prevalence of this type of error is far less than the incidence of varying judgments by cataloguers as to whether or not there should be a title added entry. There is no point in guarding against error of omission when varying judgment yields a far higher incidence of omission.

• Conclusion


Until there is a cadre of trained individuals who are equally at home in librarianship, systems analysis, and programming, it seems likely that the enormous complexities of bibliographic data processing will confine its development to multistage advancements. In fact, the advancements will probably have occurred before the cadre is in being. The Yale work is an example of multistage development, there being two major steps in the CHY Project which have formed the basis for a third major step that will be the Yale University Library's system for production of catalogue cards and book-form catalogues. In review, it seems most improbable that it would have been possible to go directly to the third step and accomplish the sophisticated output now being designed into the system; initial programs elsewhere will probably be subsequently elaborated. In the light of this observation, it seems wise at the start of a project for bibliographic data processing to recognize the probability, and perhaps even the desirability, of a multistage development.

The amount of effort required to program for bibliographic data processing is so large as to generate a pressing need for multiuse programs to be shared by different institutions. It is to be hoped that at some stage in the development of this type of processing such multiuse programs will come into being and thereby reduce programming effort. It is now obvious that although it was too much to hope for universal use of programs at the first stage, such programs should be the future
objective of those working in the field.

On the Optimum Number of Frames Per Microfiche

WOLF KUEBLER¹
Documentation Incorporated
Bethesda, Maryland

It is shown that, in reducing a collection of documents to microfiche, the efficiency of the microfiche-microframe system has an optimum value. This optimum depends on the page distribution of the collection. For example, the optimum for the NASA and DOD collections is 63 frames per microfiche. The ratio of pages to frames is then 0.57, and the ratio of documents to microfiche 0.72. The existence of such an optimum suggests that a further significant increase in the filming reduction ratio should be accompanied by a corresponding reduction of microfiche size.

• Introduction

In recent years, the use of microfiche as a means of communication of information within the scientific community has been well established. The National Aeronautics and Space Administration (NASA), the Atomic Energy Commission (AEC), and the Department of Defense (DOD) are the foremost examples for the use of microfiche on a large scale.²

Microfiche are an economic means for the mass distribution of published literature. This economy extends to essentially three parts: economy of storage, economy of reproduction, and economy of transmission. Most recently, an effort³ has been made to standardize the physical size of microfiche and the reduction ratio, and therefore the number of frames per microfiche. The reduction ratio is established by the state of the art of the microfilming and reproduction technique, and it can be expected that the present 18:1 ratio will in the future be superseded by a higher ratio. Thus the existence of an optimum way to store the documents of a collection on fiche is an important consideration in further standardization efforts.

• Page Distribution in a Collection of Documents

To find the page distribution, the approach is taken that each member of the community generating these documents is producing useful information (undefined), and that to express this information, a variable number of words as well as visual representations, such as pictures and graphs, are necessary. To take these latter into consideration, a page will be considered as the fundamental quantity of information, making no distinction in the subjective value of the information on each page; each page has as much information content as the next page. Furthermore, it is assumed that the page size and word density average out over a large population.

¹ Present address: Documentation Incorporated, 4888 Rugby Avenue, Bethesda, Maryland.
² NASA, for example, has approximately 50,000 documents on microfiche.
³ COSATI.

A collection of $D$ documents is composed of $D_1$ documents with 1 page, $D_2$ documents with 2 pages, and generally of $D_p$ documents with $p$ pages. The number $W$ of possible arrangements of these $D_p$ documents in $p$ page classes is then

$$W(D_p) = \frac{D!}{\prod_p D_p!} \tag{1}$$

where

$$D = \sum_p D_p \tag{2}$$

represents the total number of documents.

$W(D_p)$ is the discrete probability distribution for the occurrence of $D_p$ documents with $p$ pages. The most probable distribution is calculated by using equation (2) and the total number of pages

$$P = \sum_p p\,D_p \tag{3}$$

as constraints. In place of $W(D_p)$, one seeks the unconditional maximum of $\ln W$, taking care of the accessory conditions by Lagrangian multipliers:

$$\delta(\ln W) - \alpha\,\delta\Big(\sum_p D_p\Big) - \beta\,\delta\Big(\sum_p p\,D_p\Big) = 0 \tag{4}$$

Using Stirling's formula for $\ln W(D_p)$ gives

$$\sum_p \ln D_p\,\delta(D_p) + \alpha \sum_p \delta(D_p) + \beta \sum_p p\,\delta(D_p) = 0 \tag{5}$$

This is true for every $p$; therefore

$$\ln D_p + \alpha + \beta p = 0 \tag{6}$$

or

$$D_p = e^{-\alpha - \beta p} \tag{7}$$

The Lagrangian multipliers are determined from the constraint conditions expressed by equation (2) and equation (3). One finds⁴

$$e^{-\alpha} = (e^{\beta} - 1)\,D \tag{8}$$

$$\beta = -\ln(1 - D/P) \tag{9}$$

where $P/D$ is the average number of pages for a given collection. Finally, the page distribution is obtained:

$$D_p = (e^{\beta} - 1)\,D\,e^{-\beta p} \tag{10}$$

or

$$D_p = \frac{D}{P/D - 1}\,(1 - D/P)^p \tag{11}$$

For further calculation, the result in the form of equation (10) is preferred.
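Equations (9) and (10) are easy to check numerically. The following is a minimal Python sketch, not part of the original paper: the function names are hypothetical, the sample sizes D = 920 and P = 7011 are the AIAA figures reported in Table 1, and the infinite sums are cut off at 2000 pages on the assumption that the neglected tail is negligible.

```python
import math


def beta_from_counts(documents, pages):
    """Equation (9): beta = -ln(1 - D/P)."""
    return -math.log(1.0 - documents / pages)


def expected_documents(p, documents, pages):
    """Equation (10): D_p = (e^beta - 1) * D * e^(-beta * p)."""
    beta = beta_from_counts(documents, pages)
    return (math.exp(beta) - 1.0) * documents * math.exp(-beta * p)


D, P = 920, 7011  # AIAA sample sizes from Table 1
beta = beta_from_counts(D, P)

# The distribution should reproduce both constraints, equations (2) and (3).
total_documents = sum(expected_documents(p, D, P) for p in range(1, 2001))
total_pages = sum(p * expected_documents(p, D, P) for p in range(1, 2001))
```

Rounded to two places, `beta` agrees with the value of 0.14 listed for this sample, and the two sums return the document and page totals that were put in.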

Before comparing the results of this calculation with actual collections, a short discussion as to the nature of document collections is necessary. Basically, the documents fall into two classes: Class A consists of the "Open Literature"; Class B, of the "Report Literature." Class A documents are journal articles, reports of symposiums, etc., and by their nature are relatively short, say less than 30 pages. Class B documents are reports not restricted in length, although in practice they are usually less than 300 pages.

The page distribution of Class A documents was obtained by a sampling of the International Aerospace Abstracts (AIAA).⁵ The results are shown in Fig. 1 and in Table 1. The solid line represents equation (11), with P/D obtained from the sample. The interesting features are the low value and the overshoot on the first nine pages. This is accountable for by human elements, which were not taken into account in finding the most probable distribution. For example, these Class A documents contain the papers presented at symposiums. A certain time is "allowed" each speaker, resulting in papers clustered between 3 and 7 pages. This feature will not influence further calculations, as long as p ≥ 10. In Fig. 3, the result is plotted in 10-page intervals; as can be seen, the oscillating part averages out.

Table 1. The ratio of pages to documents (P/D) for different samples.

Collection  Type                             Documents D  Pages P  P/D   β
AIAA        Open Literature (Class A)        920          7011     7.7   0.14
NASA        Open Literature (Class A)        708          5611     8.4   0.12
NASA        Report Literature (Class B)      1448         72416    50    0.02
NASA        82 Frame Microfiche* (Class B)   950          --       --    0.02
NASA        58 Frame Microfiche* (Class B)   1192         --       --    0.02
DOD         Report Literature* (Class B)     1271         --       --    0.02

* The pages were not counted; the P/D value found for the NASA Class B collection was supposed to represent the data.

⁴ The summations are from p = 1 to p = ∞. The error introduced is negligible; in principle the calculations could apply to a finite scheme.
⁵ Published by the Technical Information Service, American Institute of Aeronautics and Astronautics.

Fig. 1. Page distribution obtained by sampling International Aerospace Abstracts (Class A documents, open literature); the solid line represents equation (10)

To calculate the distribution function for intervals, let the document collection be subdivided into intervals of documents having pages $(1, 2, \ldots, q)$ for the first interval, $(q+1, q+2, \ldots, 2q)$ for the second interval, and $((n-1)q+1, (n-1)q+2, \ldots, nq)$ for the $n$th interval. The number of documents $D_{(n-1)q+1,\,nq}$ in the $n$th interval is then by equation (10):

$$D_{(n-1)q+1,\,nq} = \sum_{p=(n-1)q+1}^{nq} D_p = D\,(e^{\beta} - 1) \sum_{p=(n-1)q+1}^{nq} e^{-\beta p} = D\,(e^{\beta q} - 1)\,e^{-\beta n q} \tag{12}$$

This again is an exponential function, the primary variable being the interval number $n$.
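The closed form in equation (12) follows from a finite geometric sum. The intermediate algebra below is supplied here as a worked step and does not appear in the original; it uses only the symbols already defined ($D$, $\beta$, $q$, $n$).

```latex
\begin{align*}
\sum_{p=(n-1)q+1}^{nq} e^{-\beta p}
  &= e^{-\beta[(n-1)q+1]}\,\frac{1-e^{-\beta q}}{1-e^{-\beta}}, \\
D\,(e^{\beta}-1)\sum_{p=(n-1)q+1}^{nq} e^{-\beta p}
  &= D\,e^{\beta}\,(1-e^{-\beta})\;e^{-\beta[(n-1)q+1]}\,\frac{1-e^{-\beta q}}{1-e^{-\beta}} \\
  &= D\,e^{-\beta(n-1)q}\,(1-e^{-\beta q})
   = D\,(e^{\beta q}-1)\,e^{-\beta n q}.
\end{align*}
```

Setting $q = 1$ recovers equation (10), as it should.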

The page distribution of Class B documents was obtained by a sampling of Scientific and Technical Aerospace Reports (STAR)⁶ and Technical Abstract Bulletins (TAB).⁷ The results are shown on Fig. 2 and Table 1 for 10-page intervals.⁸ The solid line represents equation (12) for q = 10 page intervals with P/D obtained from the STAR sample. There is again an overshoot with a peak at 20 pages. In Fig. 3, the results have been plotted with 30-page intervals; the overshoot has been smoothed out, as has the "spread" between the two collections.

Finally in Fig. 3 and Table 1 are the results of a sampling of the same Class B documents: NASA microfiche. There are two sizes; previous to standardization, the microfiche were on 5" x 8" size with 82 frames; now the microfiche are on 4" x 6" size with 58 frames.

The results of this part are necessary in order to calculate an efficient microfiche size for documents of a given collection.

⁶ Published by the National Aeronautics and Space Administration.
⁷ Published by the Defense Documentation Center.

Fig. 2. Page distribution obtained by sampling Scientific and Technical Aerospace Reports and Technical Abstract Bulletin (Class B documents, report literature); page interval q = 10


• Efficiency of a Document-microfiche Collection

The interval $q$ is now identified with the number of frames in a microfiche. In the $n$th interval, there are, according to equation (12), $D_{(n-1)q+1,\,nq}$ documents; consequently, there are $n\,D_{(n-1)q+1,\,nq}$ microfiche $M_n$ for this interval:

$$M_n = n\,D_{(n-1)q+1,\,nq} = D\,(e^{\beta q} - 1)\,n\,e^{-\beta n q} \tag{13}$$

The total number of microfiche $M$ is therefore

$$M = \sum_n M_n = D\,(e^{\beta q} - 1) \sum_n n\,e^{-\beta n q} \tag{14}$$

⁸ This was a selective sampling. Whereas in the sampling of AIAA each item was used, for STAR and TAB only the "open literature" items were considered. For STAR also the Class A items were considered; the result is listed in Table 1. The sampling is represented in 10-page intervals; otherwise, to get a distribution, a sample about 10 times the size would have been necessary.

Fig. 3. Page distribution obtained by sampling microfiche (q = 58 and q = 82 pages); included are the data of Figs. 1 and 2 for different page intervals
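The total in equation (14) can be checked by counting microfiche document by document. The Python sketch below is an illustration under stated assumptions, not the author's calculation: a document of p pages is taken to occupy the smallest whole number of q-frame microfiche, the closed form D / (1 - e^(-βq)) is what the series in equation (14) sums to, and the function names and the cut-off at 5000 pages are hypothetical.

```python
import math


def microfiche_by_counting(documents, beta, q, max_pages=5000):
    """Sum microfiche over the page distribution of equation (10)."""
    total = 0.0
    for p in range(1, max_pages + 1):
        d_p = (math.exp(beta) - 1.0) * documents * math.exp(-beta * p)
        fiche_per_document = (p + q - 1) // q  # integer ceiling of p / q
        total += fiche_per_document * d_p
    return total


def microfiche_closed_form(documents, beta, q):
    """Closed form of the series in equation (14): M = D / (1 - e^(-beta q))."""
    return documents / (1.0 - math.exp(-beta * q))


# NASA Class B sample from Table 1 on 58-frame microfiche.
counted = microfiche_by_counting(1448, 0.02, 58)
closed = microfiche_closed_form(1448, 0.02, 58)
```

The two values agree to within rounding error, which suggests that the interval bookkeeping of equations (12) to (14) is equivalent to counting microfiche per document.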


Fig. 4 shows the relative distribution of microfiche

$$\frac{M_n}{M} = (e^{\beta q} - 1)(1 - e^{-\beta q})\,n\,e^{-\beta n q} \tag{15}$$

for different values of $\beta q$. Equation (15) has a maximum for $n\beta q = 1$. As long as a given document collection fulfills the inequality

$$\beta q \ge 1 \tag{16}$$

one is assured that most documents are on one microfiche. For $\beta q = 1$ approximately 63% of the documents are on one microfiche, representing approximately 40% of the total number of microfiche.

Fig. 4. Relative distribution of microfiche for different βq (horizontal axis: interval n)

The efficiency of the document-microfiche system consists of two aspects. First there is the efficiency $\eta_M$ in terms of the ratio of the total number of documents to the total number of microfiche:⁹

$$\eta_M = \frac{D}{M} = 1 - e^{-\beta q} \tag{17}$$

In terms of the number of microfiche, the most efficient system ($\eta_M = 1$) is obtained if $\beta q \to \infty$. Secondly, the efficiency $\eta_F$ of the ratio of total number of pages $P$ to total number of frames $qM$ is given by

$$\eta_F = \frac{P}{qM} = \frac{P}{D}\,\frac{1}{q}\,(1 - e^{-\beta q}) \tag{18}$$

If $D/P \ll 1$, then by equation (9)

$$D/P \approx \beta \tag{19}$$

and

$$\eta_F = \frac{1 - e^{-\beta q}}{\beta q} \tag{20}$$

In terms of the number of frames, the most efficient system ($\eta_F = 1$) is obtained for $\beta q \to 0$. The overall efficiency is

$$\eta = N\,\eta_M\,\eta_F = N\,\frac{(1 - e^{-\beta q})^2}{\beta q} \tag{21}$$

The factor $N = 2.46$ is introduced to normalize the overall efficiency; the most efficient microfiche-frame system has the efficiency one. Equation (21) is shown on Fig. 5. It has a maximum for

$$(1 + 2\beta q)\,e^{-\beta q} = 1 \tag{22}$$

or

$$q_{\max} = 1.26/\beta = 1.26\,P/D \tag{23}$$

⁹ For simplicity, it is assumed that $q \gg 1$; i.e., the microfiche has many more than q = 10 pages. Thus equation (14) reduces to $M = D/(1 - e^{-\beta q})$.

Fig. 5. The dependence of the efficiency on βq; for reference, the efficiency as a function of the frame number for various β is also shown

For practical applications, the maximum is relatively flat; for 0.7 < βq < 2.1 the efficiency is greater than 0.9. Fig. 5 shows also frame scales q for various values of β. In applying this result to the NASA-DOD collection of Class B documents (open literature), one finds for β = 0.02 that the most efficient microfiche size is obtained for q = 63 frames. The present standardization is for q = 60; therefore within practical limits the optimum is well achieved. Further significant advances in the state of the art in reduction (i.e., by a factor of 2) must be followed by a proportionate reduction in frames per microfiche.
For the "older" microfiche sizes of 84 frames, the efficiency is 0.96. If European standardization of q = 36 had been adopted, the efficiency would be 0.9. For Class A documents, the present 60-frame size is uneconomical; for example, the efficiency for the AIAA collection is 0.29.
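The optimum and the efficiencies quoted above can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not the author's computation: the overall efficiency is taken in the form N(1 - e^(-βq))² / (βq) with N = 2.46, the maximum condition (1 + 2βq)e^(-βq) = 1 is solved by simple bisection, and the function names and the bracketing interval 0.5 to 3.0 are hypothetical.

```python
import math


def overall_efficiency(beta, q, norm=2.46):
    """Overall efficiency: norm * (1 - e^(-beta q))^2 / (beta q)."""
    x = beta * q
    return norm * (1.0 - math.exp(-x)) ** 2 / x


def optimum_beta_q(lo=0.5, hi=3.0, tol=1e-10):
    """Solve (1 + 2x) e^(-x) = 1 for x = beta * q by bisection."""
    def f(x):
        return (1.0 + 2.0 * x) * math.exp(-x) - 1.0

    # f is positive at lo and negative at hi, so the root lies between them.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


x_opt = optimum_beta_q()                           # close to 1.26
q_opt = x_opt / 0.02                               # optimum frames for beta = 0.02
documents_per_fiche = 1.0 - math.exp(-x_opt)       # D / M at the optimum
pages_per_frame = documents_per_fiche / x_opt      # P / (qM) at the optimum
```

With β = 0.02 the optimum rounds to 63 frames, and the two ratios round to 0.72 and 0.57, in agreement with the figures given in the text.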
From equation (17) the optimum ratio of documents to microfiche is 0.72 (for βq = 1.26). Similarly, from equation (20) the optimum ratio of pages to frames becomes 0.57.


Documentary Relevance and Structural Hierarchy

JEAN M. PERREAULT¹
Florida Atlantic University
Boca Raton, Florida

There is no one more closely identified with research into the idea of relevance as operative within the field of documentation than is Donald J. Hillman. It is not certain that he sees any real hope for arrival at any sort of technique capable of producing document-surrogates relevant to real queries, thus excepting those that satisfy queries derived from "source-documents." Yet, in his paper "The Notion of Relevance (I)" (1) he cites, presumably because it is relevant to his own discussion, a paper by B. C. Vickery, "The Structure of Retrieval Systems" (2). No classifier, surely (probably not even an indexer) would have thought of using any such conceptual indicator as "relevance" for Vickery's paper. If (and it is really conditional) we agree with the proponents of citation-indices, it must be that Vickery's paper is relevant to Hillman's topic. But why? Even more, how can a technique be devised capable of mechanically tracking down Vickery's paper as a result of the query, "What is available on relevance?"

Hillman puts the idea of relevance into question, but it seems to me that even so, no one would be unable to say quite a bit about the idea, or at least to decide whether "document 2 is relevant to query A." The term itself has become fashionably popular since Cleverdon's test-results were phrased in such a way as to emphasize relevance and recall above all other possible criteria of the evaluation of retrieval tests (3). Yet it cannot be evaded that some things that someone thought (when he indexed a particular document) were relevant to certain topics were (from the querist's point of view) not so;² it is this disparity that lowers the relevance ratio. Such disparity is also called (in conformity with another fashion, that of "information theory") noise. Also, some things the querist would (presumably, if he were able to get at them³) call relevant, but which the indexer did not, are missed, lowering the recall ratio. This disparity could also be called (in conformity with the same fashion) silence.

But another factor enters, which most reference librarians have encountered many times: purely conceptual relevance (match, fit between query and document orientation) is not enough for a patron in the real world, even if it may be good enough in a test situation. Even if all+only (conceptually) relevant documents are retrieved, they may not each be of use to the querist; formal and/or nominal considerations (4) may make even what is conceptually relevant inappropriate to a particular query.

One characteristic of relevance, as anyone could have seen, is that a judgment on it is a value-judgment.⁴ In both cases of disparity noted, leading either to noise or to silence, or in cases of formal or nominal inappropriateness, there is a sort of block involved which the temporal nature of storage and retrieval brings with it, since it is describable as "long-duration information-handling" or as a "delayed message-center." A situation could, of course, be imagined where the creator of surrogates was in possession of all the queries that were to be allowed, before analysis of the corpus; it would now be unnecessary to create surrogates, but instead only to check each document for relevance to each query. But even in this situation the querist, having delegated the relevance-judgments, could be disappointed by the results if he too had access to the corpus.

The future query, though, in the normal situation, cannot be anticipated in its own concreteness (as the past/present one can be), so the surrogate-creator attempts to indicate instead the orientation(s) of the document. "Concreteness" in this context is a bit of a technical term, signifying the philosophical concept that an entity is actual when it has received ("is concretized out of") all its formal and material perfections; or, that an essence or nature is made up of several intersecting or overlaid formalities. From this point of view, the concreteness of a document that could be discretely coded as A, B, C, D, could be exemplified (in particular by the punctuation) thus: A:([B+C](D)), or the like. The concreteness of the query is postulated as unlikely ever to match such a notation, but instead to be B(E)+F, or A₁:C(8), or the like. There are quasi-intersections between individual (discrete) formalities in each of these configurations, such (perhaps) as to make the document relevant to the query.⁵

¹ Head, Information Retrieval Division, and ... Professor of Philosophy, Florida Atlantic University, Boca Raton, Florida.
² Or which were retrieved because of faulty syndetic trails.
³ The fact that, in the "real world" of patrons and reference-librarians, the condition is that the querist cannot get at all the relevant documents unless the system reveals them to him, and that therefore the only legitimate technique for evaluation of performance is reading, by the querist, through the whole corpus to determine which (additional) relevant documents ought to have been supplied, and which supplied documents ought to have been omitted, effectively vitiates all critiques of the source-document-based "artificial" questions used by Cleverdon.
⁴ Cf. the work cited in (8) for a statement that, while admitting relevance among values, tends to make it so subjective as to be quite ineffable: "Relevance cannot function as a criterion since there is no standard against which judgments can be measured."

American Documentation, July, 1966

The desideratum in mechanized retrieval is this, that the "perhaps" be validly removable; that, in other words, formal conformity be equivalent to material relevance (and hopefully, by the aid of formal and nominal elements in the document-surrogate, appropriateness as well); and, in still other words, that the only operation required of the mechanism be a clerical one. But, as noted above, the real problem is not "that" or "whether," but "how."

If, in our example, A₁ signifies a species of the genus A, what are the conditions under which the document A:([B+C](D)) would be relevant to the query A₁ (ignoring the rest of the original query-example for the moment)? It is simply this, stated as a binding convention:

The concepts A₁, A₂, ... Aₙ are so named because of the normality of their treatment together (= A). Where thus treated together, A is the indicator. Where treated in only partial togetherness, A is no longer totally appropriate, and must be omitted in favor of those of its species actually present.


For instance, if in the upper genus “dogs” there is a
division by size, at least one middle genus will result as
“miniature dogs” (besides other such middle genera). If
the middle genus “miniature dogs” is itself divided into
Pekinese, Chihuahua, etc., it can itself be validly applied
only when all the named species are present in the document.
The general principle is thus:


No genus may be pre-scribed as including a set of
species without literary warrant for normality of such
conjunction; no species may be pre-scribed as included
to a genus except for converse reasons. Nor may any
genus be in-scribed unless the document so inscribed
treats the various species pre-scribed to that genus;
nor, mutatis mutandis, any species.


Hence, if our example reads such that A is “miniature
dogs” and A₁ is Pekinese, our convention indeed insures
that A:([B+C](D)) is relevant to A₁. But, probably,
other formalities, as in our example A₁:C(8), will be
present in each query. Postulating that the document-


3 The difference between this situation (storage for future retrieval)
and the hypothetical one outlined where there is no storage, is, besides
the lack of futurity-based modes of effort, a wholly different attitude of
analysis, depending on the possession or lack of the queries that are to
be satisfied; what the proponents of natural-language whole-text “retrieval”
are really striving for is removal of the “re-” by putting the
computer in the place of the non-surrogate-creating analyst. Again, the
lack of (at least some future) queries is what leads to the creation of
surrogates, and is what forces analysis according to all that the analyst
can have at hand: the document itself.


surrogate can be translated as “influence of cold (B) and
insufficient nutrition (C) as characteristic of high altitudes
(D) upon miniature dogs (A)”; and postulating the
query to be translatable as “influence of insufficient nutrition
(C) as characteristic of abnormal environments
(8, the upper genus of D) upon Pekinese (A₁),” it can
be seen that a purely clerical operation of the mechanism
will produce the document in response to the query, and
that its relevance is guaranteed by the convention.
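
The clerical operation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table `PARENT`, the label `ENV` (standing in for the genus "abnormal environments") and the function names are all invented here, and the rule that a query term meets a surrogate term when the two lie on one genus-species path is one reading of the convention, not a procedure given in the text.

```python
# Hypothetical genus/species links (child -> parent), named after the example.
# "ENV" is an invented label for "abnormal environments".
PARENT = {
    "A1": "A",    # Pekinese is a species of miniature dogs
    "D": "ENV",   # high altitudes are a species of abnormal environments
}


def lineage(term, parent):
    """Return the term followed by each of its successively higher genera."""
    chain = [term]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def on_same_path(query_term, doc_term, parent):
    """True when one term is the other, or lies above it in the hierarchy."""
    return (doc_term in lineage(query_term, parent)
            or query_term in lineage(doc_term, parent))


def retrieves(query_terms, doc_terms, parent=PARENT):
    """Clerical test: every query term must meet some surrogate term."""
    return all(
        any(on_same_path(q, d, parent) for d in doc_terms)
        for q in query_terms
    )


surrogate = {"A", "B", "C", "D"}   # the surrogate A:([B+C](D))
query = {"A1", "C", "ENV"}         # Pekinese, insufficient nutrition, abnormal environments
print(retrieves(query, surrogate))
```

If the link from D to its upper genus is left out of the table, the same query no longer retrieves the document, which is the kind of substantive factor that can block retrieval.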

Unfortunately, this solution extends no further than 
what can be called "explicit relevance." Such a case as 
was the starting point of this paper may not be included 
in it. But we do catch a glimpse of another characteristic 
of relevance, that a judgment on it is a hierarchic value- 
judgment. However, another look at what underlies the 
convention may furnish additional clues for a fuller solu- 
tion. It has been proposed that classification is not neces- 
sarily hierarchical (5). I cannot agree that it is a legiti- 
mate use of the word “classification” to apply it to a mode 
of document-surrogation which does not have the char- 
acteristic “hierarchical.” This presents the problem of an 
explicit distinction between (hierarchical) classification 
and those modes of document-surrogation which are non- 
classificatory. These other modes I would characterize as 
“indexing”; in any case, a clearer distinction has been
needed for some time (6).

Indexing itself can be of at least two types, depending
on whether it is controlled or uncontrolled in its vocabulary;
this distinction could be put also as “whether its
vocabulary is external or internal to the document(s).”
If controlled, it a-scribes concepts to a document; if
uncontrolled, it de-scribes the document by the words in
it. Thus the use of controlled subject-headings,⁵ just as
much as the setting-up of a concordance, is indexing;
they are both (-)scription of elements of the documents.

Classification, on the other hand, is not concerned with
the detection of concepts or words within documents, at
least not as a final purpose. It is concerned instead with
pre-scription of positions within a systematic conceptual
organization, and with in-scription of the documents to
such positions and of such position-indications to documents
and surrogates. Thus classification, too, is (-)scription,
not of the elements of documents, but of documents
as wholes and of the corpus as a whole. This may connote
the marshalling function of traditional library cataloging/classification,
or, on the other hand, the concrete pre-scription
(recall concreteness as defined by the intersection
of formalities) of articles in the Revue Internationale
de la Documentation, Referativnyi Zhurnal, etc., to complex
UDC numbers.

The systematic-conceptual-organizational aspect of
classification is also characteristic, to some extent, of controlled
thesaural subject-headings, where, if fully graphed
out (7), the syndetic aspects would result in a systematic
conceptual organization, however inadequate or difficult
to follow through. It is not classification, though, in that
it avoids going beyond the discrete level of indication of
topical orientation, except insofar as the idea of roles
and links has been adopted by indexers.

5 In the sense that they are controlled, what Mooers calls ‘descriptors’
must be re-named ‘ascriptors,’ since if controlled they are not taken
from (‘de-’) the document, but are ascribed to (‘a-’) it.


Diagrammatically, the new terminology would fit together
as in Fig. 1.

[Fig. 1. Diagram of the proposed terminology. CLASSIFICATION:
pre-scription to one class among many; in-scription of class
(documents and/or surrogates). INDEXING: a-scription (in terms
of concepts), thesaural; de-scription (in terms of words), free.]


[As a note to the idea of classification as necessarily
hierarchical, I will not deny that, even within a hierarchical
classification, classes do not stand in a hierarchical
relation to every other class. Thus, in such a schema as in
Fig. 2 the blocked-in classes are not among themselves
hierarchically arranged. But, just insofar as these notations
are conceptually filled and are used in a verbal
system (if, then, a=fishes, D=felines, F₁=Pekinese),
there are implicit (systematic [8]) relations between
them which constitute hierarchicality as soon as the
remaining terms are used (if x=animals, 8=mammals,
F=canines, etc.).]





[Fig. 2. Schema of classes in a hierarchical classification,
with some classes blocked in.]


Going back to the beginning, it can be said that relevance
is a sort of double negative. I might very well be
able to find a Biblical passage relevant to this paper, just
as Hillman found a passage in Vickery; that is, without
the thematic orientation of the cited bibliographic entity
being such as to indicate the relevance. The Bible is
further off, granted; but that is not really why I didn't
go to it in search. . . . I haven't done so, not because it’s
not (possibly) relevant, nor because it’s not particularly
appropriate, but most of all because there is a fairly
plentiful supply of more clearly relevant material nearer
by, mostly because of the work of Hillman himself.
Hillman, on the other hand, desirous of locating a relevant
source by a documentalist (rather than more citations
from Carnap, Goodman, or other philosophers; note
the need for appropriate relevant documents), looked
for a double negative: a document on documentation
that was not relevant to relevance.

Again, though, how can such a process become purely
clerical, in order to be fit for the delicate constitution of
the computer? What good (in line with my own predilections)
can hierarchy be for such a need?

Utilizing a hierarchicality that is structural notationally
as well as conceptually, the mechanism can easily find the
relevant document for the Pekinese query (the subscript
“1” is analogous to a decimal structurality), of course
assuming the binding force of our convention. This can
be reinforced all the more if the document-surrogate, in
place of a merely conjunctive colon (= “mutual influence
of some sort”), uses a more precise relational term (e.g.
“is destroyed by”), and if the query is in such terms as
(instead of its colon) “is injured by”; in a system where
not only substantives but relatives are notated structural-hierarchically
(9) such a query could likewise be satisfied
by the document-surrogate mentioned, because of a
subordinate relation between the query’s relational term
and that of the surrogate; besides helping prevent this
document’s becoming noise for a query where the relation
is strongly variant from that mentioned, but the substantives
remain the same.
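
A structural notation of this kind lends itself to a very small sketch. The codes below are invented for illustration (the text gives none): a longer code is taken to be subordinate to any code that is its prefix, in the manner of a decimal notation, and the same rule is applied to substantives and to relational terms alike.

```python
# Hypothetical structural notation: a longer code is subordinate to its prefix.
RELATIONS = {
    "is injured by": "R3",
    "is destroyed by": "R31",   # invented here as a species of "is injured by"
    "is improved by": "R7",     # invented here as a strongly variant relation
}


def same_path(code_a, code_b):
    """True when one code is a prefix of the other (one hierarchical path)."""
    return code_a.startswith(code_b) or code_b.startswith(code_a)


def satisfies(query, surrogate):
    """Compare (substantive, relation, substantive) triples position by position."""
    return all(same_path(q, s) for q, s in zip(query, surrogate))


surrogate = ("A", RELATIONS["is destroyed by"], "C")
injured_query = ("A1", RELATIONS["is injured by"], "C")
improved_query = ("A1", RELATIONS["is improved by"], "C")

print(satisfies(injured_query, surrogate))    # same substantives, subordinate relation
print(satisfies(improved_query, surrogate))   # same substantives, variant relation
```

The first query is satisfied by the surrogate; the second, with the same substantives but a variant relation, is not, so the document does not become noise for it.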

The necessary factual condition that would now lead
to the discovery of the Vickery document because of the
insertion of a query phrased however Hillman did phrase
it, is this: there is a particular system of document-surrogates
organized conceptually into substantive classes
and subclasses; capable of being mechanicoclerically
manipulated because of the structurality of its notation;
and capable of each part within it becoming modulated by
any other part, in accordance with a relational schema
similarly structural, hierarchical, and general-categoric,
such that the terms of the query are related to the terms
of the document-surrogate by pre-established paths.

Such a condition might, in this case, be met by classification
“X,” but not by “Y”;⁷ yet tomorrow there might
be another query for which “Y” would produce far
superior results. It is impossible, therefore, to say that
any particular substantive organization (for, since the
other characteristics should be made to obtain in any
classification intended for use with the computer, this is
what distinguishes one such system from another) is
always best (LC? UDC? CC? etc.); from this many
have validly but wrongly⁸ concluded that classification,
indeed the creation of any kind of surrogate, is inadequate
as the basis for mechanized retrieval, and that some
means of automatic extrapolation from the full text of the
author’s own verbalization of his thoughts is necessary.⁹
But it can be fairly conclusively shown by the testimony
of those favorable to such a technique (10) that even
such automatic means of creation of document-surrogates
is far more successful in regard to classifying than in
regard to indexing. And we must take our pick: there
are no other available modes of conceptual bibliography.


7 A substantive factor that would prevent retrieval of the
document for the Pekinese question would be, for instance, the non-inclusion
of high altitudes (D) in abnormal environment (8).

8 As E. H. Gilson puts it, “in so far as logic is concerned, one may be
faultlessly wrong as well as faultlessly right,” Being and Some Philosophers
(Toronto, Pontifical Institute of Mediaeval Studies, 1949), p. 100.

9 Cf. footnote 4.


So, our previous criteria still stand. We see that even 4 
talways do for our querists. what they would like (though, 


here as otherwhere, silence is golden in comparison to 
"noise: if the querist doesn't know the answer already — 


| fects of the system as much as will telling him something 





ides the creation of such a systematic conceptual
organization? Does factor-analysis result in invariably
reliable structures, or does associative mapping? Note
at once the essential point: if one technique of automatic
prescription is indeed invariant (so that queries can be

for another technique unless the second results in advantages
not found in the first, even if omitting some of the
first’s advantages.
The task ahead, then, is to classify аайын, both 
"manual" and automatic, (1) in accordance with their 
degree of satisfaction of our criteria, (2) in accordance 


"| they are based, and (3) [which may be but another way 
of putting (2)] in accordance with their attitude toward
and provision for relevances.
Mechanical Translation by Coordinate Indexing


We certainly all realize that the language barrier is a
serious hindrance in documentation work, and especially
so for large-scale dissemination of information in science and
technology. Maybe one day we shall live to see the automatic
translation of natural languages come through as a
working tool in documentation work, but so far we have
to accept that semantic and syntactic problems involved
in a full translation are still too difficult to be handled
by documentation centers.
For this reason it might perhaps be of some interest to the
readers of American Documentation to see some preliminary
results from an effort to make a short-cut to mechanical
translation through the use of coordinate indexing. As
described in an article in Rev. Int. Doc., Vol. 32, No. 8,
1965, we are basing the information activities in our own
organization on the use of a permutation index with
precoordinated index terms.


A simple word-for-word translation involves, of course, no
serious problems with a computer. With a precoordinated
permutation index, however, the problems are indeed somewhat
complicated, because the index terms have to be kept
in their coordination, and are finally presented


as an alphabetically arranged permutation index. In the
UNIVAC 1107 which is used for the processing of our indexes,
the programme now includes such automatic translations
of the permutation index, through simultaneous
use of bilingual vocabularies read onto magnetic tapes. With
automatic translation for coordinate indexing, semantic and
syntactic problems are no longer of serious importance.


Semantics are taken care of through the controlled vocabulary
or Thesaurus, and, apart from the use of index aids
such as roles and links, there is no strong need for expressing
syntaxes between the index terms to form a meaningful
sentence.

Problems have, however, been encountered in finding the
exact translation of each of the index terms for the basic
bilingual vocabularies, because of lack of congruence between
the different languages. What we are aiming at in
this case, however, is not an exact and lexically correct
word translation, but merely a translation into a concept
which does convey to the reader the same basic idea. For
this reason the bilingual vocabularies used are nonreversible,
being used one way only.
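
As a rough illustration of the procedure described, the following Python sketch translates one precoordinated entry through a one-way vocabulary and then re-sorts the permuted lines in the target language. The vocabulary entries and the document number are taken from the sample index lines printed with this letter; the function names, and the choice of simple rotation as the permutation scheme, are assumptions made here and are not a description of the UNIVAC 1107 programme.

```python
# One-way (nonreversible) vocabulary, Norwegian -> English.
NO_TO_EN = {
    "MAGNETFELT": "MAGNETIC FIELD",
    "BUESVEISING": "ARC WELDING",
    "LYSBUEDANNELSE": "ARC-FORMATION",
    "STABILITET": "STABILITY",
}


def translate_entry(terms, vocabulary):
    """Translate each index term, keeping the terms in their coordination."""
    return [vocabulary[term] for term in terms]


def permutation_index(entries):
    """Rotate every entry so each term leads once, then sort alphabetically."""
    lines = []
    for terms, doc_no in entries:
        for i in range(len(terms)):
            lines.append((terms[i:] + terms[:i], doc_no))
    return sorted(lines)


norwegian = [(["MAGNETFELT", "BUESVEISING", "LYSBUEDANNELSE", "STABILITET"],
              "66 04433")]
english = [(translate_entry(terms, NO_TO_EN), doc_no)
           for terms, doc_no in norwegian]

norwegian_index = permutation_index(norwegian)
english_index = permutation_index(english)

for terms, doc_no in english_index:
    print("  ".join(terms), doc_no)
```

Because alphabetical order differs between languages, the index has to be sorted again after translation; translating the lines of an already sorted index would leave them out of order.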


The illustrations show some samples from the Norwegian,
English, German, and Russian versions of this mechanically-translated
permutation index for a collection of welding


LYSBUE           SVEISESONE            DEFORMERING      MAGNETFELT        66 04433
LYSBUEDANNELSE   STABILITET            MAGNETFELT       BUESVEISING       66 04433
MAGNESIUM        REPARASJONSSVEISING   FLYMOTOR         SPREKK            66 02440
MAGNESIUM        STSPEGODS             TIG-SVEISING     VARMEBEHANDLING   66 02440
MAGNETFELT       BUESVEISING           LYSBUEDANNELSE   STABILITET        66 04433
MAGNETFELT       DEFORMERING           SVEISESONE       LYSBUE            66 04433

Fig. 1. Norwegian

LICHTBOGEN             SCHWEISSZONE          DEFORMIERUNG           MAGNETFELD       66 04433
LICHTBOGENBILDUNG      MAGNETFELD            LICHTBOGENSCHWEISSEN   STABILITAET      66 04433
LICHTBOGENSCHWEISSEN   ALUMINIUM-LEGIERUNG   MANGANLEGIERUNG        TITANLEGIERUNG   66 04434
LICHTBOGENSCHWEISSEN   EINSEITIG             PUNKTSCHWEISSUNG       BLECH            66 02442
LICHTBOGENSCHWEISSEN   FLUSSMITTEL           OXYDIERUNG             THERMODYNAMIK    66 04439
LICHTBOGENSCHWEISSEN   SCHWEISSAUSRUESTUNG   AUTOGENSCHWEISSUNG     KOMBINATION      66 02443

Fig. 2. German

MAGNETIC FIELD    ARC WELDING         STABILITY      ARC-FORMATION            66 04433
MAGNETIC FIELD    ELECTRIC ARC        WELDING ZONE   DEFORMATION              66 04433
MAGNETIC FIELD    STABILIZATION       MOTION         ELECTRIC ARC             66 04436
MAINTENANCE       WELDING EQUIPMENT   INSTRUCTION    CARBON DIOXIDE WELDING   66 02439
MANGANESE ALLOY   TITANIUM ALLOY      ARC WELDING    ALUMINUM ALLOY           66 04834
MANGANESE STEEL   WELDABILITY         STEEL ALLOY    [illegible]              66 02438

Fig. 3. English

KORPUS SUDNA       VALIKOVYI MATERIAL    REMONTNAYA SVARKA       LITIYO                   66 04426
KOVKAYA STAL       KHROMOVAYA STAL       PREDEL PROCHNOSTI       SVYAZYVAYUSHCH. SVARKA   66 04825
KRUGLAYA SVARKA    SVAROCHNAYA MASHINA   [illegible] BAK         DVUSTORONNII             66 02434
KUZOV              MOTORNYI PRIVOD       OBRABOTKA LISTOV. ZHEL. KOLESNOI ELEKTROD        66 [illegible]
LEGIROVANAYA STAL  PREDEL PROCHNOSTI     MARGANTSEVAYA STAL      SVARIVAYEMOST            66 02838
ZONA SVARKI        STRUKTURA METALLA     TYOPLOCHUVSTVITELNOST                            66 04419

Fig. 4. Russian


documents. As one of the first results these experiments
have clearly demonstrated the importance of an exact indexing
from the side of the indexer, and full stringency in the
selection of index terms. The Russian version of the index
can now be printed also with cyrillics through the use of
Siemens’ teletypewriter T type 100. In this case the bilingual
vocabulary contains the necessary code for cyrillic
letters, and output from the computer is a 5-channel paper
tape to operate the telex.


W. Horsr, Adm. Director 

Norwegian Industries Development 
Association 

Forskningsveien 1 

Oslo 3, Norway
Users Versus Documents


For the first time in these pages (or in any pages as far
as I know), Mantell¹ has provided reasonable and justifiable
estimates of the amount of technical papers being
produced. It would appear, from these estimates, that
many of us have been crying “wolf.” The “information
explosion” is seen to be a rather small bang. Lest anyone
be lulled into a false sense of insecurity by the loss of our
most quoted “statistic,” let me reiterate a point often made
before,²,³ namely, that our problems stem not simply from
documents or users, but rather from the interactions between
them. Thus the size (or rate of increase) of our problem as
distinguished, say, from that of the publishers or universities
must be reckoned in terms of the product of users and
documents.

The accompanying graph (Fig. 1), using Mantell’s (i.e.
МГ) figures (and the most conservative for 1970), shows
the anticipated growth of scientists and engineers (S + E),
i.e., users (U), documents (D), and their interaction (UD).
The latter has been normalized to fit the scale of the graph.

The information specialist’s job, after all, is to match
users to documents and vice versa. UD represents all possible
matches that one might examine. In a linearly
organized file on a serial access machine⁴ or in most SDI
systems,⁴,⁵,⁶ all of UD must be examined. Humans, luckily,
have content-addressable memories. Depending on the efficiency
of the matching algorithm that they use, the proportion
of UD that must be examined may be significantly
reduced. This is the best argument I know for the need for
random-access, content-addressable memory for information
retrieval systems.
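
The point about UD can be made concrete with a small Python sketch. Everything in it is illustrative: the user profiles, the documents and the function names are invented, and a dictionary keyed by term stands in for a content-addressable memory. The serial scan examines every user-document pair; the keyed lookup examines only the pairs that share a term.

```python
def serial_matches(profiles, documents):
    """Serial file: every user is compared with every document (all of UD)."""
    examined = 0
    matches = set()
    for user, interests in profiles.items():
        for doc_id, terms in documents.items():
            examined += 1
            if interests & terms:
                matches.add((user, doc_id))
    return matches, examined


def build_index(documents):
    """Content-addressable stand-in: map each term to the documents bearing it."""
    index = {}
    for doc_id, terms in documents.items():
        for term in terms:
            index.setdefault(term, set()).add(doc_id)
    return index


def indexed_matches(profiles, index):
    """Look up each interest directly; only candidate pairs are examined."""
    examined = 0
    matches = set()
    for user, interests in profiles.items():
        for term in interests:
            for doc_id in index.get(term, ()):
                examined += 1
                matches.add((user, doc_id))
    return matches, examined


profiles = {"u1": {"welding"}, "u2": {"translation"}}
documents = {
    "d1": {"welding", "steel"},
    "d2": {"indexing"},
    "d3": {"translation"},
}

serial, serial_cost = serial_matches(profiles, documents)
indexed, indexed_cost = indexed_matches(profiles, build_index(documents))
print(serial == indexed, serial_cost, indexed_cost)
```

In this toy case the serial scan examines U x D = 6 pairs, while the keyed lookup examines 2 and finds the same matches.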
Reversible Punctuation Russian Transliteration


In processing Russian scientific papers, we need to reproduce
the original spelling of English names which have been
represented by Russian spellings. At present, we can't be
sure that our English respelling is accurate when, as periodically
occurs, our only clue is the Russian spelling of a
name originally in English.

There are reasonably satisfactory Russian to English
transliteration systems. There is no accurate English to
Russian counterpart. What does exist is a subjective approach
based on Russian assumptions as to how the English
name is pronounced, which is inadequate even when the
pronunciation assumptions are correct.

Since there are 7 English letter concepts (of the 26), which
have no single transliteration equivalent among the 33 Russian
letter concepts, some substantial innovation is clearly
required if we are to achieve reversible processing of the 40
different letter-concepts in the combined Russian-English
alphabet.
While there are some sounds in English not found in
Russian, and some sounds in Russian not found in English,
the real transliteration problem arises from the differing
spirits of the two alphabets. Russian is not always phonetically
unambiguous, but its users try to be. Users of English
are content to rely on groups of letters and associations of
sounds for hints as to pronunciation.

The very valuable discussion of Russian transliteration in
Science (1-3) made some of these points while leaving others
insufficiently clear and explicit.

As other comments have suggested more or less explicitly
(4-6), the real problem is getting the Soviet Russians to
adopt a transliteration of the English or Latin alphabet into
Russian which, in general, will be reversible in that it permits
the original English or Latin alphabet spelling of a
name to be reproduced automatically. To solve our problem,
we need to persuade the Russians to abandon a
phonetic transcription approach for indicating English-type
alphabet names in scientific contexts.
Of the 40 different letter concepts in the two languages, 19
(see Fig. 1, B and C) already have single letter symbol
equivalents in both languages for transliteration from Russian
into English. Two Russian letter-concepts, the hard
sign (or, in Bulgarian, the neutral vowel), and the soft sign,
which gives a y sound to preceding consonants and following
vowels, are already customarily represented in English
by the punctuation symbols " (quotation mark) and
' (apostrophe). (Strictly speaking, the quotation mark or
apostrophe used as punctuation should be separated when
included in transliterated material by one space from what
it modifies, and by two spaces from words not modified.
Note that Cyrillic [Russian-type alphabet] apostrophe is
transliterated into Russian as N° followed by soft sign, and
into English as ;' [semicolon apostrophe] [cf. 7].)


Three Russian letter-concepts are customarily represented
by the English two-letter combinations zh, kh, and sh, without
ligatures or diacritical marks or accents. (For complete
unambiguity in automatic machine transliteration, z:h, k:h,
and s:h might be desirable.)
Before we consider the other 16 Russian-English letter-concepts,
we must comment rather firmly on such remarks
as “We may simply ask our printer to supply the non-available
letter [i with a cup over it]” (2, pp. 485-486). The
controlling factor is not what we can persuade some printers
to do at extra expense, but what the average English or
Russian typewriter will do, as is.
For at least 3 Russian-English letter-concepts (t:s, sh:ch,
and the Russian transliteration of English x as k:s), we need
a ligature symbol to indicate that two or more symbols in
one language represent a single symbol in the other. The
dash (hyphen) or absence of a mid-letter-level dot are
inadvisable for this. Since the colon between two letters
visually suggests letter ligature, and since the colon is never
followed directly by a letter when used as punctuation, it is
used as the sign of ligature.
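
The effect of the colon as a sign of ligature can be illustrated with a short Python sketch. The letter table below is a small subset chosen for illustration, using the customary equivalents together with the ligated forms z:h, k:h, s:h and t:s named in the text; it is not the full table of Fig. 1, and the function names are invented. Because a ligated group always has the shape letter, colon, letter, the reverse pass can recognise it without ambiguity.

```python
# Illustrative subset only; not the complete table of the proposed system.
RU_TO_EN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "з": "z", "и": "i", "к": "k", "л": "l", "м": "m", "н": "n",
    "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f",
    "ж": "z:h", "х": "k:h", "ш": "s:h", "ц": "t:s",   # ligated with a colon
}
EN_TO_RU = {latin: cyrillic for cyrillic, latin in RU_TO_EN.items()}


def to_english(text):
    """Substitute letter by letter; unknown characters pass through unchanged."""
    return "".join(RU_TO_EN.get(ch, ch) for ch in text)


def to_russian(text):
    """Reverse pass: a letter-colon-letter group is read as one ligated symbol."""
    out = []
    i = 0
    while i < len(text):
        group = text[i:i + 3]
        if len(group) == 3 and group[1] == ":" and group in EN_TO_RU:
            out.append(EN_TO_RU[group])
            i += 3
        else:
            out.append(EN_TO_RU.get(text[i], text[i]))
            i += 1
    return "".join(out)


for word in ["шахта", "цех", "отсек", "жук"]:
    latin = to_english(word)
    print(word, latin, to_russian(latin))
```

Without the colon, the single letter ц and the pair тс would both come out as ts and the original spelling could not be recovered; with it, they stay distinct (t:s against ts).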
In general, we use preceding , (comma) for the soft sign
or any cup or v mark over a letter, and as a sign of aspiration;
two dots (..) preceding for a dot or slant over
a letter; and following slant or virgule (/) to indicate that
the original letter symbol had an extender.

We use the Russian equivalent of ,i to transliterate
English y into Russian and yh and eh for the English
transliterations of the Russian "hard" i and "hard" e letter-concepts.
We can then adopt the current Russian to English
transliteration usage of yu and ya for the last two Russian
letters. For complete unambiguity in automatic machine
transliteration, y:h, e:h, y:u, and y:a can be used.

A general transliteration use of the preceding English
semicolon and Russian N° (number) sign is to indicate that
the original symbol for a transliterated Cyrillic letter concept
is visually identical with a letter in the English alphabet.
The combination ;ch to indicate the Russian letter-concept
pronounced t:sh is a special use of the semicolon in
English.

The use of the Russian equivalent of ,v to transliterate
English w is by analogy with the Belorussian letter ,u. We
use a ligated doubled letter as in the Russian equivalent of
k:k for English q, so as to represent the class of letters
to which q originally belonged. In some Cyrillic alphabets,
a form which would be transliterated as the Russian equivalent
of dd is used to represent a characteristic sound or
[...]. The phonetically ambiguous English letter-concept c,
which we transliterate by the Russian equivalent of t:s/,
shares one of its several sound values with the unambiguous
Russian letter-concept t:s. It is hoped that the
“second order extender” aspect of the Russian equivalent of
t:s/ may suggest the intrinsic ambiguity of the English
letter-concept c to Russian users.

In transliterating j, however, it was felt that the Russian
equivalents of d:[...]/ or zh/ or [...] would fail to convey the
intrinsic ambiguity of the letter-concept j, and the Russian
equivalent of d:zh/ was consequently used to transliterate
English j.

The 14 modifications in present Russian to English transliteration
practice (including yh and eh, but not the current
" and ') are summarized in Fig. 1, A.

A good system should entail as few changes as possible in
the existing alphabetization of Russian names and titles in
English. The proposed punctuation transliteration system
requires only two changes (eh and yh from e and from y or
i) in the English alphabetization of transliterated Russian
names now used by Biological Abstracts, BSI and ASC Z39,
Chemical Abstracts, Applied Mechanics Reviews, and Science
Abstracts. The British Museum would also use " and
' instead of omitting a transliteration for these
letters, and would use yh for ui. The USBGN would use ,i
instead of y and initial e instead of initial ye (8-14). The
Library of Congress now differs from all these systems in
using iu and ia for yu and ya (15).

For some applications, it is desirable to assign a definite
alphabetical order position to most of the punctuation and
Arabic numeral symbols. This is done in Fig. 1.

The lack of a transliteration system guaranteeing the recovery
of the exact original spelling of a name after it has
passed through a letter-substitution “mill” (16) must also
cause Soviet citizens inconvenience in connection with non-Russian
Soviet, and other Cyrillic names entering Russian
directly, and Latin alphabet Slavic names entering Russian
via English.


Outside the Soviet Union, the Serbs and Macedonians
(Yugoslavia), the Bulgarians, and the non-Slavic Mongolians
use modified Cyrillic alphabets. Inside the Soviet
Union, modified Cyrillic alphabets are used by the Ukrainians
and Belorussians, and by about 52 non-Slavic Soviet
nationalities. The Cyrillic alphabets other than Russian
include about 41 additional Cyrillic letter-concepts, 14
duplicates which now have to be transliterated separately if
the original text is to be reproduced exactly, and 4 ambiguous
symbols at least one use of which should be
eliminated, but which now would have to be transliterated
separately. (Fig. 1, D.) (17-21.)


The East Armenians in the Soviet Union (and the West
Armenians elsewhere) have an alphabet of their own, as do
the Soviet Georgians, and those Jews in the Soviet Union
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А. Fourteen E A H Y IH H э lU X/ Jla/ IK ,B KC р/ Armenian Rus. Eng. || Armenian Rus. Rng.) 
modifications: 2.6 ,1  t:B joh sh:;ch yh eh a h 1 q v х y Ula a 5 ћ j 
a 1 1 
3 
B. BUSSIAN- >B- 1 5A B B p m Дук в X 3g Rind 6 |> Vis au | у 
ENGLISH: » Wo- 1; ; A B V O D т Bue 7H Z I + жтт & 6|2. ш eh 
А + a “|в: : 
HOH/ K KK КС JI M HON P C T y o X Х/ ц ц/ тд n [sis seo а 
J.Y KE Q X L хк о Р ROS тог XH H mB C bjr] e кр. af 7? 
Ч шш >b ы Ь э ю я ?/ | .2456789 |ч aly] nl 
ЮН ЗИ SECH " YR б MH YU YA : 2 /| . 2456789 bl E 51, p! 
| eh 2 | діж| асан 
C. Exlit- , 4 - 1 ; у жа b с а e eh f g h A M y ejan pp | rnm 
Russian i - I њ ~ ч а б uy A e э ф г х/ ft Pp >T А7 Ulu в 
џ а zh т 
db X т 1 nm n o Р 4 T в sh сц t tts п v 6 93| ж 1 1 
и Да ко X лом и о n. KK p с ut щ T n y B | 
Rj h и 5 |а” ту! 
i 1 
у | X y уа yh yu т zh " ?! : 2 / ,. ,е2456789 log a 3 Pone pil 
›в |кс й я ы ю 3 ж Bb bo: 2 / ё 2456789 Бірі x S 
19 shared; 14 Rus.; 7 Eng.; 15 ушаб.) | 14 anplioates; 4 ambig.; 4 obs.; 41 other. д ж 8 ig | ou ј sts 
D. Cyrillo АЕ т > AF? Y wọ F vs mj tis DR L| yp! wy 
maMuterution а ёа b + ў è mq 5» Y ~ oo rr ју aj ™ ііі? 
into Russian ‚а е п т ‚у ‚ч чїй ~y -ъу мв ма mrj 2 
and English: ја ja ур jt „u  ,.J0h -:i in ~ ow j ж=/ 2 cua x/ Р 
P h РЁ ‚к x 
3, J7 I о h! ок t БОМ 4y 8 8 LH425| a | Juil ess] rr 
Fig. 1. Cyrillic transliteration into Russian and English.

Fig. 2. Armenian transliteration into Russian and English.

Fig. 3. Georgian transliteration into Russian and English.

Fig. 4. Yiddish and Hebrew transliteration into Russian and
English. Russian and English transliteration into unpointed
Yiddish-Hebrew. (Note: The Yiddish-Hebrew alphabet is
read from right to left.)

who [...] Yiddish and use the Hebrew alphabet for it.
Armenian, Georgian, and Yiddish written in the Hebrew
alphabet share some letter concepts with Russian, and also
have some letter concepts which are not found in Russian.
We use a preceding , (comma) to indicate aspiration, and a
following exclamation point (!) to indicate a glottalic
abruptive, and can then easily transliterate Armenian and
Georgian into English or Russian (Figs. 2 and 3, notes).

(The West Armenians outside the Soviet Union, who use
the same complete Armenian alphabet, do not pronounce
the glottalic abruptives or the aspirates as such, and make
changes in the sounds of the letters. In West Armenian,
the letter concepts which are aspirated in East Armenian
[transliterated with a preceding , (comma)] lose their
aspiration without interchange, and, aside from the loss of
the glottalic abruptive pronunciation, there is interchange
of the root phonetic values of the letters of the b and p!,
g and k!, d and t!, d:z and t:s, and d:zh and t:sh! pairs.
The suggested transliteration system is based on the East
Armenian used in the Soviet Union, but should suffice to
transliterate any Armenian text, regardless of how the
writer would pronounce it.)

For the Hebrew-Yiddish alphabet we need special symbols
for the letter aleph (!, exclamation point, not directly
related in this use to its use to represent glottalic
abruptives), for the letter ayin (?, question mark), for the letter
teth (t:t, doubled with ligature), and for the letter vav (,u)
(30-33; 17 pp. 84-85, and 19, p. 27) (Fig. 4).

The punctuation system permits transliterating French
acute accents (by preceding period), circumflex accents (by
preceding /: slant colon), French diaeresis or German
umlaut (by preceding double period), cedilla or extra
extender on letter (by following slant [/]), Spanish tilde
(by :' [colon apostrophe] following letter), and French
accent grave (by preceding [...]; by
preceding No. [number] sign into Russian).


The Latin alphabet Slavic languages, related to Russian
and of particular interest to the Russians, average six extra
consonants, each differentiated by diacritical marks.


To transliterate the 11 Serbocroatian variants we need


zo (ud), zh T i0: 0D, n: (ap, t: Се), kh Т? 
d:zh (d, 3); and sh (а). 
| To transliterate. Czech we need: а се е i jn о 
. gu (acute ‘accent initially, cirele above, etter medially and 
. finally); .y and ‚в. In Slovak, /:o can be used to indicate, о 
` circumflex. In botli Czech and Slovak, ‚О. and d:' and ‚Г 


and t: ге variant forms of the-same letter. Slovak Also, has. : 


ја l апа: т and л (u acute medial). 


In Polish, we need to, transliterate z'with an acute accent’ 


as jz, and to изе z:sz for z with а dot-over it, the z.or zh 
analog. of "hard" зя апа es. Polish transliteration then re- 
quires a/ ` 
5:87 (m with a dot over it):^ 

With :' used to indicate a tilde-like sign over the original
letter, -: used to indicate a horizontal line over or through a
letter, and the standard usage of preceding .. for umlaut,
preceding , for cup or v over a letter, preceding . for acute
accent, and following / for an extra extender, the extra
letters in the Soviet non-Slavic Latin alphabet languages,
including Latvian, Lithuanian, and Estonian, can be
transliterated as ава -:&.B/ ,c e -:e e/o g/ - if X V m.
-0 0:1 А мл d/ and ,Z (34-35, 17 pp. 75-79).


The punctuation transliteration system appears to be an
adequate solution to the problem of reversible English-
Russian-English transliteration of the names of scientists
that would obviously be as useful to the Russians as to [...]

1601 56th Street

Simulation of Boolean Logic
Constraints Through the Use 
of Term Weights 


The evolution described below of one aspect of the NASA
Scientific and Technical Information Facility’s machine
search system may be of general interest to the documentation
profession.

The Facility began operations in early 1962. The literature
search service, or “demand bibliography” service, as it
was then termed, was initially a very modest endeavor for
the simple reason that the data base upon which to search
had yet to be built. The first search programs concentrated
on the well-known Boolean logic capabilities in the searching
of inverted term files on magnetic tapes. This was
consistent with the contractor’s (Documentation Incorporated)
prior R&D experience with so-called “Uniterms”
and coordinate indexing systems.

A major change was effected, beginning in January 1965,
to a serial or linear type of file organization. The reasons
for this change were many and varied and need not concern
us in any detail here. They involved, primarily, efficiencies
in the file maintenance and update procedures and in the
journal index preparation procedures. Also, it was becoming
imperative to be able to search the file on a variety of
nonsubject, administrative categories of information. At the
time of this change, additional capabilities were built into
the new “linear” search system. To supplement the basic
Boolean capability, we now, among other things, made
available to ourselves the following strategies that were
well known in the state-of-the-art: (1) a weighting technique,
(2) a “root” searching technique, and (3) a system
of nonsubject “limits.”

The weighting technique permits the assignment of 
arbitrary weight values to search terms and the specification 
of a minimum weight which any document must achieve in 
order to become a "hit." 
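As a minimal sketch of the weighting technique just described (not the Facility's own program, which the text does not show), the following Python fragment sums the weights of the search terms that index a document and compares the total with the minimum weight. The function names and the sample index terms are hypothetical.

```python
def document_weight(doc_terms, term_weights):
    """Add up the weights of the search terms that index this document."""
    return sum(w for term, w in term_weights.items() if term in doc_terms)


def is_hit(doc_terms, term_weights, weight_limit):
    """A document becomes a hit when its total weight reaches the limit."""
    return document_weight(doc_terms, term_weights) >= weight_limit


# Hypothetical query: arbitrary weights chosen by the searcher.
query = {"ROCKETS": 3, "NOZZLES": 2, "ABLATION": 1}
doc = {"ROCKETS", "ABLATION", "TELEMETRY"}

print(document_weight(doc, query))  # weights of ROCKETS and ABLATION
print(is_hit(doc, query, 4))
```

Terms that are absent from the query contribute nothing, so the limit alone decides how many of the weighted terms must coincide.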

“Root” searching permits queries on any desired generic
level of various entities, e.g., all contracts with the prefix
NASS8-; all report numbers with the prefix RAE-; all
authors with names beginning CAR-. It may soon be extended
to index terms, as in all terms beginning
“PNEUMO,” etc.
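Root searching amounts to a prefix match. A small illustrative sketch, assuming the field values are held as plain strings (the sample report numbers and author names are invented for the example):

```python
def root_search(values, root):
    """Return every value that begins with the given root (prefix)."""
    return [value for value in values if value.startswith(root)]


# Hypothetical field values, for illustration only.
report_numbers = ["RAE-TN-101", "RAE-TR-64", "NACA-TN-2000"]
authors = ["CARLSON", "CARTER", "CORTEZ"]

print(root_search(report_numbers, "RAE-"))
print(root_search(authors, "CAR"))
```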

The system of “limits” permits the specification of various 
additional constraints on a search other than those involving 
subject index terms. Nearly all the standard descriptive 
cataloging elements fall within this system. 

Each of these new capabilities has seen a great deal of
use. The weighting technique, however, has particularly
caught the interest of the searching staff and has resulted 
in some far-reaching developments. 

For instance, it is apparent that document weight becomes
a way of ranking search output in order of relevance.
Probably the first use that weights were put to within the 
Facility was not to limit the output — the Boolean equation 
did this — but to arrange it for either the user or the analyst 
or perhaps both. This became extremely valuable in an 
environment where search output received a human edit 
before it was released. Arbitrary weight levels could be set 
by the analyst above which relevance to the question was 
assumed and below which his editorial effort was concen- 
trated. 
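The ranking use of weights can be sketched as follows. This is an illustration under stated assumptions, not the Facility's procedure: the hits are taken to be already selected by the Boolean equation, and the document identifiers, weights, and review level are hypothetical.

```python
def rank_hits(hits, term_weights, review_level):
    """Sort hits by descending weight and split them at the analyst's level."""
    scored = sorted(
        (
            (sum(w for t, w in term_weights.items() if t in terms), doc_id)
            for doc_id, terms in hits.items()
        ),
        reverse=True,
    )
    assumed_relevant = [d for weight, d in scored if weight >= review_level]
    needs_review = [d for weight, d in scored if weight < review_level]
    return assumed_relevant, needs_review


# Hypothetical hits, keyed by an invented document identifier.
weights = {"A": 3, "B": 2, "C": 1}
hits = {"DOC-1": {"A", "B", "C"}, "DOC-2": {"A"}, "DOC-3": {"A", "C"}}

relevant, review = rank_hits(hits, weights, review_level=4)
print(relevant)
print(review)
```

Documents at or above the level are passed as relevant; the analyst's editing effort goes to the remainder.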

It also became apparent that the weighting technique
could, by itself in some situations, achieve exactly the same
results as a Boolean equation; cleverly assigned weights
could simulate such an equation. For example, the equation
(1) A(B + C + D) = Answer, can be completely bypassed
through the following weight assignments: A=3, B=1,
C=1, D=1; Weight Limit = 4. This becomes very useful
to know, for the calculation of weights was a much faster
computer process than the solving of a Boolean equation,
and the substitution could lead to significant computer time
savings. Other common types of substitutions were the
following:


(2)  A + B + C + D        A=1, B=1, C=1, D=1    Weight Limit = 1
(3)  A·B·C·D              A=1, B=1, C=1, D=1    Weight Limit = 4
(4)  A + (B·C·D)          A=3, B=1, C=1, D=1    Weight Limit = 3
(5)  (A + B) + (C·D)      A=2, B=2, C=1, D=1    Weight Limit = 2
(6)  (A + B)·(C·D)        A=1, B=1, C=2, D=2    Weight Limit = 5

Various rules of thumb can easily be developed, and were,
for the proper assignment of weights in more complex
situations of the above basic types. However, no mathematical
formalization was ever attempted.
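The substitutions for equations (1) through (6) can be checked mechanically by running through all sixteen combinations of the four terms. The sketch below is an illustration only; the weights for (4), (5), and (6) are this edition's reading of the damaged printed table (with + as OR and · as AND), and the helper name `simulates` is hypothetical.

```python
from itertools import product

TERMS = "ABCD"


def simulates(boolean_fn, weights, limit):
    """True if 'weight sum >= limit' agrees with boolean_fn on every combination."""
    for present in product([False, True], repeat=len(TERMS)):
        has = dict(zip(TERMS, present))
        total = sum(weights[t] for t in TERMS if has[t])
        if (total >= limit) != boolean_fn(**has):
            return False
    return True


# Equation number -> (Boolean equation, term weights, weight limit).
CASES = {
    1: (lambda A, B, C, D: A and (B or C or D), dict(A=3, B=1, C=1, D=1), 4),
    2: (lambda A, B, C, D: A or B or C or D, dict(A=1, B=1, C=1, D=1), 1),
    3: (lambda A, B, C, D: A and B and C and D, dict(A=1, B=1, C=1, D=1), 4),
    4: (lambda A, B, C, D: A or (B and C and D), dict(A=3, B=1, C=1, D=1), 3),
    5: (lambda A, B, C, D: (A or B) or (C and D), dict(A=2, B=2, C=1, D=1), 2),
    6: (lambda A, B, C, D: (A or B) and (C and D), dict(A=1, B=1, C=2, D=2), 5),
}

results = {n: simulates(fn, w, lim) for n, (fn, w, lim) in CASES.items()}
print(results)
```

A check of this kind is a convenient guard when weights are assigned by rule of thumb, since a single disagreeing combination is enough to show that an assignment fails.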
It was soon realized that though term weighting had its
advantages, nevertheless there were some equations that
could not be reduced in this way. Two of the most basic
are the following:

(7)  (A + B)(C + D)
(8)  (A·B) + (C·D)

The above equations cannot be emulated through any
assignment to their terms of positive or negative weights,
in conjunction with a weight limit. This can be proved by
fairly simple algebraic techniques which will not be gone
into here.

Continuing examination of the recalcitrant situations led
to the development of a special “Group Weight” system for
processing them. Essentially this involves “multiplying out”
the equation, identifying its sections or groups, and assigning
weights and weight limits for each section. Equation (7)
thus becomes the equivalent (7A) A(C + D) + B(C + D)
and weights may be assigned as follows:

Group A: A(C + D)     A=3, C=1, D=1;  Weight Limit = 4
Group B: B(C + D)     B=3, C=1, D=1;  Weight Limit = 4
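
The statement that equations (7) and (8) cannot be reduced to one set of weights and one limit can be sketched in a few lines. This is one possible argument, not necessarily the one the author had in mind; the symbols $w_A, \dots, w_D$ (term weights of either sign) and $L$ (the weight limit) are notation introduced here, and the display assumes the amsmath package.

```latex
% Equation (7), (A+B)(C+D): a document is a hit when its weight sum is >= L.
\begin{align*}
\text{hit } \{A,C\}:&\quad w_A + w_C \ge L, &
\text{hit } \{B,D\}:&\quad w_B + w_D \ge L,\\
\text{non-hit } \{A,B\}:&\quad w_A + w_B < L, &
\text{non-hit } \{C,D\}:&\quad w_C + w_D < L.
\end{align*}
% Adding the two inequalities in each row gives
\[
w_A + w_B + w_C + w_D \ge 2L
\quad\text{and}\quad
w_A + w_B + w_C + w_D < 2L,
\]
% which is a contradiction. For equation (8), (A.B)+(C.D), the hits are
% {A,B} and {C,D} and the non-hits are {A,C} and {B,D}, so the same two
% sums clash in the same way.
```

Giving each group its own weights and its own limit removes the contradiction, because the four inequalities no longer have to share a single $L$.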


The search program is now in the process of being
changed to permit this technique. Logical equations will be
made an optional, not a mandatory, feature of a search
question. All types of logical equations may then be converted
solely to a system of term weights and weight limits.
Tests have been run comparing search times for ten problems
coded by equation against the same ten coded with
weights, both sets being run on our IBM-1410 search system
against the same single reel of the data base. Results
indicate that there is a 4 to 1 time advantage to running
in the weight or arithmetic mode. However, it is clear that
complicated equations can be both difficult and laborious to
code. The next step is therefore obvious. In those cases
where weights would be used mainly to simulate Boolean
logic for the sake of processing speed, there is no reason that
the program should not accept the equation and calculate
its own weight assignments. This is now being evaluated.
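
One way a program might calculate its own group weights is sketched below. It is an assumption of this example, not the Facility's method: the equation is supplied as a product of OR-factors, it is multiplied out completely into AND-groups, and every term in a group receives weight 1 with the group's limit set to the number of its terms. All function names are hypothetical.

```python
from itertools import product


def multiply_out(factors):
    """Expand a product of OR-factors into a list of AND-groups.

    [["A", "B"], ["C", "D"]] stands for (A+B)(C+D) and expands to
    the groups AC, AD, BC, BD.
    """
    return [frozenset(combo) for combo in product(*factors)]


def assign_weights(groups):
    """Weight every term in a group 1 and set the group's limit to its size."""
    return [({term: 1 for term in group}, len(group)) for group in groups]


def is_hit(doc_terms, weighted_groups):
    """A document is a hit if it reaches the limit of at least one group."""
    return any(
        sum(w for term, w in weights.items() if term in doc_terms) >= limit
        for weights, limit in weighted_groups
    )


equation_7 = assign_weights(multiply_out([["A", "B"], ["C", "D"]]))
print(is_hit({"A", "D"}, equation_7))
print(is_hit({"A", "B"}, equation_7))
```

Full expansion yields more groups than the coarser hand grouping A(C + D) and B(C + D), so it trades some arithmetic for a rule that needs no judgment; coarser groups can be passed to `is_hit` in the same (weights, limit) form.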

It is thought that this case history in the use
of weights may be of interest because of the widespread,
current use of weights in machine search systems. Several
systems seem to be dropping the Boolean capability per se
altogether in favor of weights. The two are generally spoken
of in these situations as disparate entities. It is not that
simple. The closeness of the relationship is shown by the
fact that the weighting technique can be made to simulate
Boolean logic. However, in doing so, the weighting technique
may become too difficult for use. On the other hand, the
logical equation is perhaps the most easily comprehensible
way a search question with a complex relationship of terms
can be organized and displayed. Our own solution is to keep
both strategies in order to take advantage of the unique
capabilities that each has to offer. At the same time, we are
attempting to take advantage of the newly realized (at least
as far as we are concerned) relationship between the two
systems by utilizing the fast weight calculation process as a
technique for internal computer solving of a logical
equation.


W. T. BRANDHORST
Assistant. Director. for. борае г 
NASA Scientific and Technical
Information Facility


Dear Sir:

I read with much interest the letter from Mr. Robert L.
Birch in the January 1966 issue of American Documentation.
I was particularly interested in his general rule No. 5 in
the “Bibliographic Suicide-Guide: For Publishers and Authors,”
included in his letter. The ASTM Bulletin was
discontinued at the end of the year 1960. The name of
the Society was changed from American Society for Testing
Materials to American Society for Testing and Materials
late in 1961. The change in the Society’s name has nothing
to do with the unfindability of the ASTM Bulletin in view
of its discontinuance. In actual fact, the ASTM Bulletin,
which was published eight times a year, was replaced by a
new monthly publication, MATERIALS RESEARCH &
STANDARDS, the first issue of which appeared in January
1961.


T. A. MARSHALL, JR.
Executive Secretary 
American Society for Testing 
and Materials 
Philadelphia, Pennsylvania 





Dear Sir:


Mr. Thomas A. Marshall, Executive Secretary of the
American Society for Testing and Materials, has very kindly
sent me a copy of his letter about the notes on bibliographic
suicide technique which appeared in the January 1966 issue
of American Documentation.

Mr. Marshall is right, of course, in that the 1961 change
in the name of his society (from American Society for Testing
Materials) could not affect previous efforts to find the
ASTM Bulletin, which had already ceased publication. His
letter brings out a very important point, however, concerning
the “labeling” of publications for efficient retrieval from
the informational mainstream.

If Mr. Marshall means that the present findability of the
very valuable material in the ASTM Bulletin remains
unaffected, he is only partly right. Each change in the name
of a journal or in the name of the sponsoring body silts over
one of the strings by which the publication may be traced.
In this case, efforts to find still-valuable articles in the
ASTM Bulletin will be unaffected to the degree that they
depend on the title as a tracing. Such a title, beginning
with a set of initials, offers more opportunity for confusion
in any alphabetical listing than would, for instance, such a
title as Steel Research Journal, which is easy to remember
and offers only one filing location. The ASTM Bulletin
may be reasonably listed just before such a title as A to Z
in Alphabeting, or between the ASPR Handbook and the
ASTME Membership List; all of these, according to some
filing traditions, would file before the Aardvark Foundation.
Another place to look for the ASTM Bulletin would be
between Asters and Asteroids and, say, Astounding Science
Fiction. None of these locations, of course, would be affected
by a change in the interpretation of the initials or
the name of the sponsoring body.

On the other hand, the effort to find an ASTM Bulletin
article may be through the name of the Society. A searcher
looking under the present name of the ASTM would, of
course, find no entry for the bulletin, since it ceased publication
before the change. If the searcher knows that the
Society has changed its name and that the publication


Letters to the Editor 


ceased before that change, he may seek under the former 
name. In any case, retrieval of ASTM Bulletin material 
has been affected by the change of name of the Society. 

I can sympathize, of course, with the editor who is impa- 
tient to get the substantive part of his work done, by getting 
the information in print and out to the subscribers. 

Forty years ago files were smaller and cross references
could be made, at least in the larger files, so that a subscriber
wishing to put in an order for a hearsay title would
be quite likely to be able to identify it and send his order
to the right place. The larger the files grow the more
necessary it becomes that management study the bibliographic
consequences of the wording and layout of titles
and the other identification elements connected with a
publishing venture.

In the last few months two major journals on materials
have been announced. The title of one is Journal of Materials
Science. The other was announced as Journal of Materials,
but the illustration of the title page shows it as the
ASTM Journal of Materials. An earlier publication, still
being published, is Materials Research & Standards; but,
considering the layout of the cover, it will probably be
referred to as Materials, with the Research & Standards
being treated as a subtitle. Possibly the title will be read
Materials ASTM, with the rest as subtitle.

The rich increase in productivity in America made possible
by standards such as those prepared so painstakingly by
ASTM has made possible a vast growth in the number and
diversity of publication efforts. The efforts of directory
publishers and catalogers, and of manual and computer
bibliographers, to keep track of the publications and make
them findable, will be most successful for those which pose
least problems of alternative filing location or memory
strain.

The checklist of rocks and shoals on which a new publishing
venture may be wrecked (the “Bibliographic Suicide-Guide”),
which assumed that publishers were parties to a
benevolent plot (to prevent searchers from finding the
information on which they might have to take action), was
designed to permit the launchers of new publications to
decide, with better prevision, whether to have their efforts
result in new sandbars or whether the new effort will become
part of the informational mainstream, findable on demand.


ROBERT L. BIRCH
Science Index Group 
Falls Church, Virginia 


Dear Sir: 


I would like to add some remarks about the article of 
Mr. Harry Baum, A Clearinghouse for Scientific and Tech- 
nical Meetings: Organizational and Operational Problems, 
which appeared in American Documentation, Vol. 17, No. 1, 
28 (1966).

Exactly as the number of journals in all disciplines of 
science and technology is growing ever more rapidly, so the 
number of meetings held is increasing, too. It is understand- 
able that oral, direct communications, and personal contacts 
are excellent for each group of specialists and give more 
positive results than even the best papers in the journals. 

Organizing a clearinghouse for meetings is an excellent
idea and I agree with the author that here the cooperation
between the organizers of meetings and the clearinghouse is
the most important point. The success of the clearinghouse
depends on the good will of the organizer. The efficiency of
the clearinghouse could be enhanced if the organizers of a
meeting provided the clearinghouse with advance copies of
the abstracts of papers to be presented at the meeting. Such
abstracts are usually submitted to the organizers at least
half a year before the meeting takes place.

I see a second task for the clearinghouse in the publication
of proceedings of the meetings. This could be done together
with the organizers and specialists chosen by them to
evaluate the papers so as to avoid publishing overlapping
information. Such a sifting would be of great value to
scientists who have to go over the innumerable volumes of
proceedings of conferences.

These two functions, quick information about a conference
and publishing activities, should be the goals of the
future clearinghouse for scientific and technical meetings.


ALINA GRALEWSKA
Head, Library & Technical Inf. Dept.
Soreq Nuclear Research Center


Dear Sir:


I present for your consideration as a note in American
Documentation a definition that appears in an unpublished
paper that I presented at Dillard University (New Orleans,
Louisiana) in March of this year. The title of the paper
is: “... of the Great ... for the ... Sciences.”

“Reflective Thinking”: Creative information retrieval.


MANTON C. RHANEY
Dean, College of Arts and Sciences
Florida Agricultural and Mechanical University
Tallahassee, Florida


Dear Sir:


In your double role as editor of American Documentation
and member of the staff of the leading publisher of citation
indexes, you are in an effective position to attempt
a modest reform in the habits of most scientists and many
documentalists.

I refer to the use of initials instead of full forename. I
should think this could wreak havoc with citation indexing
when one gets to the point that there are 50 different
authors all with the name “R. Jones.”

I would guess that this practice originates from the
association of a few hundred specialists who all know one
another in an “invisible college.” I also suspect a certain
amount of arrogance: “Everyone should know who I
am.” Of all people, documentalists, who sometimes must
index millions of citations, should not be guilty of this
practice.


R. JORDAN
(excuse me, ROBERT THAYER JORDAN)
Council on Library Resources, Inc.
Washington, D.C.


7/66-IR INTREX: Report of a Planning Conference
on Information Transfer Experiments. September 3,
1965. Carl F. J. Overhage and R. Joyce Harman, ed. MIT
Press, Cambridge, Mass. 276 pp.


The establishment of a four-year program of “information
transfer experiments” is M.I.T.’s proposal for finding
solutions to the growing pains faced by large research
libraries. To set the stage, and formulate a coordinated
program, Professor Carl F. J. Overhage and the other
INTREX planners organized a five-week planning conference
attended by librarians, documentalists, scientists, engineers,
publishers, and others. The actual discussions and
work sessions were held from August 2 to September 3,
1965; the place was the National Academy of Sciences
Summer Studies Center at Woods Hole, Massachusetts.
The five-week conference was sponsored by the Independence
Foundation of Philadelphia, Pennsylvania.

The report’s opening statement describes the objective of
the INTREX experiments as follows: “. . . to provide a
design for evolution of a large university library into a
new information transfer system that could become operational
in the decade beginning in 1970.” The timetable for
the series of experiments, and follow-on installation and
implementation, allows four years for experimentation and
two or three years for development and construction, so
that the target operation time probably centers around 1975.
With target times set, and more or less specific investigative
goals, it is clear that the M.I.T. effort should not be thought
of as a continuing library research effort, but rather as a
carefully-limited development project. The care shown by
the planners in insuring the realism of the experiments
(selection of a real library environment, gathering of a
corpus of machine-readable data from actual operating
sources) supports the feeling that the experiments will
yield practical constructive results.

About half of the INTREX report’s 276 pages is devoted 
to the report text; the other half consists of twenty papers 
prepared by conference participants on various related 
subjects. The report text comprises a summary section and 
seven chapters. The first six chapters develop a picture of 
the INTREX concept of the future on-line intellectual 
community, and the last chapter, in seven sections occupying 
80 pages, details the proposed INTREX experimental 
program.

In the first section, discussing the Model System, the contrast
is drawn between most traditional library catalogs,
which include only the “barest minimum” of information,
and the proposed INTREX “augmented catalog,” which
will not only include additional information such as table
of contents, abstract, intellectual level, etc., but also
information about the catalog itself, such as “records of its own
use and use of the documents recorded in it.” The catalog
would also include data about unpublished works and linkages
to other files. The remainder of the discussion of the
augmented catalog deals with file organization, search,
equipment requirements, and other questions for investigation.
Finally, under “text access,” the many variations
of forms, media, uses, and environmental conditions are
suggested for consideration during experiment design.

An all-too-short (5 pages) section entitled “Integration
with National Resources” explores the possibility and
advisability of making available the resources of a large
number of libraries and information centers to individual
users. The initial INTREX experiment along this line
would tie in with two major specialized information centers:
National Library of Medicine and the NASA information
system. Other users, both specialized and general, are
also contemplated, to add scope and experimental validity.


A section on Fact Retrieval proposes research of more
enlarged breadth of vision. In fact, this area of study seems
to belong to a generation different from that of the other
INTREX experiments. It is proposed to organize the contents
of whole groups of basic reference works so as to provide
factual responses to questions, rather than references
to works containing such responses. By combining the
indexing, manipulative, and computational power of the
general-purpose computer, far richer and more detailed inquiries
would be susceptible to useful treatment. The automated
index, automated handbook, and even automated notebook
(in which a researcher would record individual and interim
notes) are briefly discussed. The implications for
the future, following along these lines of research, are only
mentioned, but the view from this height begins to make
one dizzy, with its suggested implications of availability of
all recorded knowledge.

The next section, devoted to initial INTREX facilities,
comes back to earth, with the description of the current and
soon-to-be-installed computation facilities at M.I.T. (GE
645 and IBM 360/67), and other support, both “hard” and
“soft.” The recommended initial hardware complement includes
a time-shared computer system, and a hierarchy of
slower, larger-volume storage devices, such as magnetic-disc,
magnetic-chip, and image-microform stores. There will be
coaxial cables to the 10 to 30 subscribers, flexible,
high-resolution CRT displays and interaction consoles, portable,
personal, book-size microform viewers, and remote-inquiry
type terminals (typewriters and advanced typewriters).

A series of experiments and speculations are discussed in 
a section entitled “Related Studies: Extensions and Elabo- 
rations.” More or less tied to the role of the on-line library 
in educational processes at the university are: provision of 
classroom back-up information, self-education, selective dis- 
semination of information, and browsing. Particularly in- 
teresting are the suggested experiments to test the validity 
and utility of “accidental discovery” through conventional 
and “electronic” browsing. Experiments with use of on-line 
information to facilitate publication are proposed, includ- 
ing cooperative endeavors by editors, reviewers, and authors 
while drafts are “in the system.” Publication through a
microform system, even while in rough manuscript form, is 
discussed, and situations under which this is advantageous 
are mentioned. Also briefly mentioned is the notion of 
“on-demand” publication, whereby on-line text would be 
reproduced on request, either from microform or digitally- 
recorded data. Finally, experiments are suggested to investi- 
gate methods for reaching decisions on weeding out (or 
selective retention). 

Areas for research and development in support of the 
INTREX objectives are: consoles, interaction language, 
content and user needs analysis, and information transfer 
theory. Research and development obviously deserves ex- 
tended treatment, but the section on R and D goes into the 
field only enough to whet the appetite. 

Finally, a section on Data Gathering for Evaluation sug- 
gests several ways in which the system can be made to keep 
its own records. They are discussed under the following 
headings: data on use, economic controls, data on learn- 
ing, and an annotated user’s card. 

In the twenty papers (it is understood that these were 
selected from more than a hundred) comprising the last 
half of the report, some discuss concepts which found 
their way into the recommended experimental program. 
The subjects discussed include network experiments, user 
studies, graphics, browsing, content-analysis, indexing, edu- 
cation, interaction languages, measurement, modeling. All 
are thought provoking. Particularly penetrating were Dr.
Vannevar Bush’s remarks regarding the tremendous impact
on society of [...] of analytic machinery
in libraries and information systems.

The INTREX report, with its proposed experimental
program, details a wide range of investigations showing
great imagination and comprehensiveness. It seems to
this reviewer that the planners’ [...]
be possible to attack in four years. Attempts to go too far
in some areas may lead to disappointments which will hurt
the orderly progress of library technique researches. At the
same time, too little was said about relatively mundane
matters such as the need for efficient organization of
subject-matter terms, search strategies, and indexing techniques.

It is apparent that, even though INTREX experiments
will undoubtedly yield [...]
science-and-technology slant tend to overemphasize the
fulfillment of current science information needs. The inclusion
of proposed “fact retrieval” experiments not only underscores
this but would tend to add new, possibly overwhelming,
dimensions to the library problem, especially affecting
storage requirements. Perhaps the tendency in several
recent automation proposals to marry the library and the
information system has been hasty, although this may
become feasible some day (“information system” is used
here in the sense of automatically supplying answers to
queries rather than references to sources of answers). Simply
to enrich a large library’s bibliographic apparatus (installing
modern techniques, adding tables of contents, abstracts,
additional subject indexing, better service and control, etc.)
stretches the state of the art. But it is probably not reasonable
within the time-frame under discussion to make complete
contents of works available electrically, except in
specialized situations.


Assembling an all-star ‚сав. Professor’ Overhage:. has 
focused an impressive concentration of technical skill in а 
‘dramatic bid to create an image of. the future, library. To 
shepherd .over 70 distinguighed participants. and Visitors 


| through five weeks of discussion, experimentation, and crea- 


- tive’ output 


volumes for his management savvy and 
. diplomatie 


Пашу сору in mid-October) of a finished, well organized 


: report is evidence of a high; order of editorial determina- ` 


* tion.. We can all look forward toa stimulating four years 


2 thatit can be applied 


| `, and uhderstanding of its basic concepta is vague. 


5 156 А 


: of productive experiments in the information sciences. 


SAMUEL B. SNYDER ` 
i Information Systems Specialist 


| RIS | 20007 The Library of Congress 
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СЕЛІ Alphabetical Subject Indication of Informa- 


tion. Vol. 3. 1966. John Metcalfe. Rutgers Series:on Sys- 
tems for the Intellectual Organization of Information. 
Graduate School of Library Service, Rutgers, the State 
University, New Brunswick, New Jersey., 148 рр. | 


Mr. Metcalfe, of the University of New South Wales; hag 


< had a long- and ‘distinguished career as a reference librarian, 

өз.” library science instructor, 
; specialty is bibli 

2. books in the field of indexing, subject cataloging, and subject . 

' classification are already known in the library fie ald. 


апа. library administrator, His 
phie organization, and his previous 


His most recent, work is based on his presentation made 
‘before a panel at а recent seminar in the Rutgers University 


Series on -Systems for Intellectual Organization of Informa- . 


tion, sponsored by the Graduate School of Library Science. 
e basic concept of Alphabetical БҮРЛЕРІ Indication of 


. Information is the intuitive system of known names in 


known order. Familiar examples can be found in the Library 
Oof-Congress catalog and the Wilson indexes. According to 
the author, the basic advantage of alphabetical indexing i is 


without the coding difficulties of some .other systems. 
"However, Mr.. Metcalfe finds that the system is in an 

-imperfect stage, since little is being done to improve it 

‘hus, the 

. copying of the National Catalog is often done uncritically 
for’ reasons of economy and because of the increased ina- 
= of iu copyists-to be critical. 5 
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PW 


ave laid on more than will, 


у much valuable constructive data and. 
. ` insight for all libraries, ће МІТ. locale ‘and obvious sci- 


eaks 
skill Furthermore the delivery, within six 
. weeks (the conference ended on September 3— I received ' 


io an unlimited number of items 


. British National Bibliography, Ranganathan’s’ Colon. Clasgü-. . 
Classification, Universel Decimal; 


ment, the author devotes: more than half of his work to 


A DTI 
To prove siat alphabetical, E aduon сап. Servé E 
many purposes, althoügh-tho system is in^need of. improye-' 


. tracing the development of this and: other subject indication-. e 


systéms and classification, orders.- А critical analysis of- 


Cutter, Schwartz, Provost, Anderson, and: Kaiser is pre- ©. 


sented.- In the field of .classification ‘the ‘Standard: ‘Catalog, 


fication, Dewey’s Decimal 
Classification, and the British’ Technology . Index are exam- 


ined. "After the ihtroduction to Alphabetical Subject Indi-:-. 
cation and the chapter on. Historical i n the book" .: 
Input £o.the--:- 
System ; IV, The Store to be Searched; V, Searching, Мейһ- 2... 
plicatións.for whiéh.. - 
nsuited; VII, Evolu--'.' 


is divided into-six additional, chapters: 


ods and Output; VI, Discussion of A 
the System is Theoretically Suited: or 
tion of the Method; and VIII, Seminar Panel Discussion. 

Undoubtedly, Mr. 


and the comparative analysis of other systems. In addition, 


КНИНУ Metcalfe’s main contribution is his ,' 
logical.defense of the system supported . by historical. facts: . 


: his chapter on. tools for the construction of ‘indexes i is well Y 


organized. : Dn 
However, his discussion: of chain indexing i is at times: сой- 


fusing. Moreover, the author, while aiming at a compré-_~--. 
hensive treatment of the subject, did not include- recent "о 
research in: this field. Computer-produced indexes, with ће“, 
ontext indexes, arè- also ’ 


exception of Key Word in 


Omitted. Nonetheless, both .documentalists and librarians D 


will welcome the appearance of this report as an excellent' A 


guide to но in n alphabetical subject indexing. 
i> ковав І. Lawicky 


H. W. Wilson Company


7/66-3R Principles of Automated Information Retrieval.
1965. William F. Williams. The Business Press,
Elmhurst, Ill. 439 pp.


In Principles of Automated Information Retrieval, Mr.
William F. Williams sets himself noble and laudable goals.
As the introduction states: "This book is intended to
eradicate an imaginary and rapidly disappearing boundary
line between data processing systems and information
retrieval systems. For the executive of either a large or a
small organization, it will serve as an introduction to
information retrieval systems. For librarians, systems-procedures
personnel, systems analysts, and programmers, it
will serve as an instruction manual in the design and
operation of information retrieval systems. For the executive
responsible for business functions which demand improved
management and control of information, this book
encourages decisions to improve these functions and gives
adequate authority and background." Mr. Williams' purpose,
then, is to provide ONE source book of necessary
information for the many varied approaches to mechanized
information systems, and to answer the needs of management,
librarians, systems and procedures writers, systems
analysts, and programmers with one volume. The fact that,
in my judgment, he does not succeed in this endeavor does
not change my feeling that this is a worthwhile book.

Mr. Williams, who has had considerable information systems
experience with a number of industrial organizations,
including the pioneering efforts at DuPont, and who is
presently Manager of General Electric's Marketing Information
Systems, certainly knows the area about which he is writing,
and his book gives ample evidence of the fact that his
knowledge is based on real, firing-line experience in the
design and operation of information systems, and not on
vague, specialized, and highly theoretical conclusions.
Moreover, he has compiled the information for his work with a
great deal of care and effort. The 439 pages are replete with
diagrams, flow charts, photographs, and descriptions of
techniques and equipment. The 27-page glossary is a combination
of definitions from the library, documentation, and
data processing fields, and the bibliography and index are
both carefully and satisfactorily done.

I suppose the reason for my qualms about the book is
that it tries to be too many things for too many people,
and it doesn't quite succeed in pulling the rabbit out of the
hat. In the early chapters Mr. Williams tries to write for the
busy executive confronted with the decision of whether or
not to automate. Chapter 1, in fact, is optimistically entitled
"What Executives Should Know About Initiating New
Information Retrieval Systems." I may not be the busy
executive at whom Mr. Williams is aiming his message, but I
would find the general bits of information contained in
the 22 pages of Chapter 1 of little help in "confirming [my]
suspicions that the time is ripe to use information retrieval
fully." Mr. Williams is an engineer and systems man and
his background, of necessity, flavors his writing. If he does
not succeed in being all things to all people, I cannot really
fault him for this; I doubt that anyone could be. He
certainly tries to approach each discipline on its own level of
comprehension. The book is full of useful data in tabular
and graphical form. For those who presumably don't have
other access to it the author even includes a breakdown of
the Dewey Decimal system, and a listing of 375 "stop"
words for use in Permuted Title Indexing. Nevertheless, as
the coverage of the subject proceeds, the mathematical
formulas and network charts get thicker, and the going for
the non-systems engineer gets rougher.
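
For readers unfamiliar with the technique, permuted title indexing with a
"stop" list can be sketched roughly as below. This is a minimal Python
illustration and not the procedure of Mr. Williams or of General Electric;
the function name permuted_index and the short stop list are assumptions
made for the example, and the 375-word list printed in the book is not
reproduced here.

```python
# Illustrative stop list only (an assumption); far shorter than the
# 375-word list the review mentions.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def permuted_index(titles):
    """Return sorted (keyword, rotated title) pairs for each non-stop word."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue  # stop words never become index entries
            if i == 0:
                rotated = title
            else:
                # Bring the keyword to the front; "/" marks the wrap point.
                rotated = " ".join(words[i:] + ["/"] + words[:i])
            entries.append((word.lower(), rotated))
    return sorted(entries)


result = permuted_index(["Principles of Automated Information Retrieval"])
for keyword, line in result:
    print(f"{keyword:12} {line}")
```

Each title thus appears once under every significant word, which is why the
length of the stop list directly controls the bulk of the printed index.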

The author can probably be forgiven his biases in favor
of the work of his employer, the General Electric Company.
Although other examples are brought in, the book is heavily
flavored with GE examples and GE systems. As an example,
the permuted title listings are drawn from a 1962 GE
manual, and little if any mention is made of the earlier
and generally more fundamentally considered work of the
late H. P. Luhn. Statements such as "For purely technical
information systems, General Electric Co. has made the
greatest advance in complete structuring of information"
are open to detailed and heated debate, and certainly require
greater substantiation than they receive.

Although the purpose is not stated as such, Mr. Williams'
book appears to be designed for use as a graduate textbook.
Half of the book is given over to chapters concerned with
fundamental approaches to documentation, abstracting, in-
dexing, coding, storage, retrieval, and vocabulary control,
and all chapters end in a series of questions to test the read-
er's (or student's) comprehension of the material covered.
Terms are carefully defined as introduced, and every at-
tempt is made to proceed in logical sequence. It could
almost be assumed that this book was written to serve as
a textbook for a course or series of courses on information
retrieval to be taught by Mr. Williams and hopefully emu-
lated by others, and this may, in fact, be the case.

Despite its shortcomings, the book is well written, basically 
well documented, and full of all kinds of information of 
value to those of us who work in this field. It makes a 
worthwhile addition to our reference shelves and to those of 
our technical libraries.


HERBERT S. WHITE
Executive Director
NASA Scientific and Technical Information Facility


7/66-4R International Affairs. Universal Reference 
System, Political Science, Government and Public Policy 
Series. Vol. I. 1965. Metron, Inc., 80 East 11th Street,
New York. 1205 pp. 


No one will argue the fact that lack of bibliographical 
control coupled with an ever increasing flood of literature 
has plagued social scientists for decades. Efforts of varying 
magnitude to cope with the literature access problem have 
been projected, discussed, and in some cases, applied. Few, 
if any, of these projects have produced significant results. 

The Universal Reference System (URS), as conceived by 
Alfred de Grazia, is a computerized documentation and in- 
formation retrieval system which will attempt to provide 
multifaceted access to the substantive literature of the social 
and behavioral sciences through selection, storage, and in- 
dexing in depth. The services of the URS are to be made 
available on an individual basis by means of automatic 
printout and also in published form.

The volume under review is the first to appear of a projected
series of ten volumes, each of which will cover some
phase of political science, government and public policy.
It contains citations and full annotations to 3,030 books, 
articles, and documents dealing with all aspects of interna- 
tional affairs, drawn from the literature of economics, soci- 
ology, anthropology, psychology, and history as well as 
political science. The nature of the selection process is 
somewhat vague, but emphasis is upon those works which 
exhibit strong methodological characteristics. Titles which 
are purely evaluative, journalistic, non-empirical, or intuitive 
have been generally excluded. Although the period covered 
is primarily the twentieth century, with major emphasis 
upon the post-World War II era, the “classics” of interna- 
tional affairs have been included. Foreign titles make up 
only 5 percent of the total works cited and it is hoped that
the future will see greater expansion in this area. Despite 
the lack of a well-defined basis of selection the end result 
is to be commended. The titles included are of high quality 
and the annotations which accompany each citation are 
clear and concise, emphasizing scope, methodology, and 
findings or conclusions reached. 

The major key to the URS is its computer indexing sys-
tem which is based on a general classification of the meth-
odology and techniques of the social sciences devised by 
Professor de Grazia. Some 183 standard descriptors make up 
a “topical and methodological index” to which are added 
121 “unique descriptors” which are particularly significant to 
international affairs, e.g., Nationalism, Nuclear Power, Cold 
War, and individual geographical names. Each document 
cited has been assigned from ten to twenty descriptors in 
order to illustrate all of its important facets. 

In the first major part of the volume, the “Catalog”
(pp. 1-212), all documents cited are randomly listed, giving
for each: full bibliographic information, an excellent anno-
tation, and a listing of all descriptors assigned to the title.
The second part, “Index of documents” (pp. 213-1197),
is an alphabetical descriptor index with each entry com-
prised of four columns. Each descriptor is listed in the
first column with symbols indicating whether the work cited
is a book, a long article, or a short article, accompanied by
the date of publication. The second column lists three or
four other “critical descriptors,” those which indicate the
major facets of the work. Column three lists all other
descriptors assigned to the document. Column four indi-
cates the serial number of the document as it is listed in
the “Catalog.” It should be noted that all descriptors, except
cross references, listed in the “Index of documents” appear
in truncated or abbreviated form, e.g.,


POL/PAR   Political party
SELF/OBS  Self observation
BAL/PWR   Balance of power
REV       Revolution
SIMUL     Scientific model
WOR-45    World wide to 1945


Other features of the volume include: (1) a table of the
standard descriptors arranged in their logical classified se-
quence giving both truncated form and expanded defini-
tions; (2) a frequency table of all descriptors, arranged
alphabetically with page references to the “Index of docu-
ments”; and (3) an alphabetical author index.
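
The four-column arrangement of the "Index of documents" described above can
be pictured with a small sketch. It is offered only as an illustration under
stated assumptions: the catalog entries, serial numbers, dates, and descriptor
assignments below are invented, the function name build_index is
hypothetical, and nothing here is claimed to reproduce the URS computer
programs themselves.

```python
from collections import defaultdict

# Hypothetical catalog entries (invented for illustration). Each document
# carries a few "critical" descriptors and any number of other descriptors.
CATALOG = [
    {"serial": 1, "kind": "book", "year": 1960,
     "critical": ["BAL/PWR", "REV"], "other": ["POL/PAR", "SIMUL"]},
    {"serial": 2, "kind": "short article", "year": 1963,
     "critical": ["SIMUL", "BAL/PWR"], "other": ["SELF/OBS"]},
]


def build_index(catalog):
    """Map each descriptor to rows of the four columns the review describes."""
    index = defaultdict(list)
    for doc in catalog:
        for descriptor in doc["critical"] + doc["other"]:
            row = (
                f"{descriptor} {doc['kind']} {doc['year']}",        # column 1
                [c for c in doc["critical"] if c != descriptor],   # column 2
                [o for o in doc["other"] if o != descriptor],      # column 3
                doc["serial"],                                     # column 4
            )
            index[descriptor].append(row)
    # Alphabetical descriptor order, as in the printed index.
    return dict(sorted(index.items()))


idx = build_index(CATALOG)
for descriptor, rows in idx.items():
    for row in rows:
        print(row)
```

Because every document is entered once under each of its ten to twenty
descriptors, an index built this way grows to many times the size of the
catalog, which is consistent with the page counts given for the two parts.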

Utility of the URS system in general and of the interna- 
tional affairs volume in particular will of necessity be mea- 
sured over the passage of time. It represents a substantial 
departure from the more conventional formats of biblio-
graphical apparatus and necessitates precision and ingenuity
on the part of the user in stating his requirements. Mastery
of the classification system and the descriptor list is essen-
tial to effective use and scope limitations must always be
paramount in the user’s mind. 

Despite the fact that the new format is awkward to use 
and seems strange to those accustomed to other tools, the 
producers of the URS must be congratulated on their 
pioneering efforts to provide social and behavioral scientists 
with a new and variegated approach to their monumental 
information problems. 


THOMPSON M. LITTLE
Associate Director of Library Services
Hofstra University 


American Documentation — July 1966 151 




Progress in Information Science and Technology


The 29th Annual Meeting of the American Documentation Institute will be
held in Santa Monica, California, on October 3 through 7, at the Miramar Hotel.


PROGRAM OUTLINE


TUTORIAL SESSIONS - October 3, 1966
Information System Design - R.M. Hayes - UCLA
Information Center Operations - A. Kent - University of Pittsburgh
Usage of Information - S. Herner - Herner and Co.
Evaluation of Hardware and Software - To be announced
Language Data Processing - H.P. Edmundson - SDC
[...] - D. Hillman - Lehigh University
Special Session


STUDENT PROGRAM - October 3, 1966
Cocktail Hour
J. Harvey - Chairman, Student Membership Committee


PROGRESS REVIEW SESSIONS - October 4-7, 1966
Professional Aspects of Information Science and Technology - R.S. Taylor - Lehigh University
Information Needs and Uses - H. Menzel - New York University
Content Analysis, Specification and Control for Document Retrieval Systems - P. Baxendale - IBM
File Organization and Search Techniques - D. Climenson - U.S. Government
Man-Machine Communication - R.M. Davis - Dept. of Defense
Evaluation of Indexing Systems - C.P. Bourne - Programming Services, Inc.
Automated Language Processing - R.F. Simmons - SDC
New Hardware Developments - M.E. Stevens - National Bureau of Standards
Information System Applications - J. Baruch - Bolt, Beranek and Newman
Library Automation - D.V. Black - University of California, Santa Cruz
Information Centers and Services - G.S. Simpson - Battelle Memorial Institute
National Information Issues and Trends - J. Sherrod - Atomic Energy Commission


SPECIAL FEATURES

Author Forums                            Special Interest Groups
Discussion Groups                        Proceedings
Prize Papers                             Award of Merit
Special Libraries Association Session    Exhibitor Presentations
Placement Service                        Tours
Exhibits                                 Evening in Disneyland
Information Theater                      Buffet Luau
Officers Workshop


A distinguished international publication of
continuing value to documentalists...


INFORMATION STORAGE AND RETRIEVAL


Including Mechanical Translation


Editor-in-Chief: J. Farradane
Institute of Information Scientists Ltd., London


The task of recovering knowledge from those who produce it, or more
specifically, from the written records of their work, for transmission
to those who need it has been widely recognized as an urgent problem.
The waste of effort involved in repetition of research already
done elsewhere is increasingly expensive. The loss to the world of
unrecovered knowledge may be incalculable.


Information Storage and Retrieval represents a serious endeavor to aid in the
coordination of these efforts into an organized discipline, where one may proceed
methodically from facts to theories, from theory to new advances, and thus on to
the solution of present and future problems. Distinguished editors from three
continents insure that published articles are of the highest technical quality and
readability, the governing criteria being excellence and timeliness.


Information Storage and Retrieval is published quarterly by Pergamon Press at
a subscription rate of $30 per annum for libraries, government establishments,
research laboratories, etc. For individual orders, please write directly to the
publisher.





Automatic Documentation
Edited by A. Ghizzetti, Istituto Nazionale per le
Applicazioni del Calcolo, Rome. Papers presented
at NATO Summer School in Venice. Contents:
Four Lectures on Algebraic Linguistics
and Machine Translation/Automatic Translation
of Languages/Un Système morphologique, compromis
entre les Facilités de la Compilation, les
Recherches Syntaxiques et l'Adaptation à de futurs
Programmes de T.A./On the Equivalence of
Models of Language used in the Fields of Mechanical
Translation and Information Retrieval/
An Introduction to Computational Procedures in
Linguistic Research/Syntax/Syntactic Integration
Carried out Mechanically/Langages Artificiels,
Systèmes Formels et Traduction Automatique.
242 pp. $15.00


Modern Trends in Documentation
Edited by Martha Boaz, University of Southern
California. This book discusses the individual
needs of libraries served by information retrieval
systems, problems of locating information, mechanical
translation by automatic computers and
language data processing, automatic encoding
and data retrieval by microfilm, magnacard and
minicard. 103 pp. $4.50


Digital Computers in Action
By A. D. Booth. An introduction to the digital
computer and its varied programming possibilities.
152 pp., flexi-cover. $2.95
AMERICAN DOCUMENTATION


INSTRUCTIONS TO AUTHORS


American Documentation is a publication of the Ameri- 
can Documentation Institute. It is a scholarly journal in the 
various fields in documentation and serves as a forum for 
discussion and experimentation. Papers already published or 
in press elsewhere are not acceptable. For each proposed 
contribution, one original and two copies (in English only) 
should be mailed to Mr. Arthur W. Elias, Editor, Ameri- 
can Documentation, Institute for Scientific Information, 
325 Chestnut St., Philadelphia, Pennsylvania 19106. The 
manuscript should be mailed flat in a suitable-sized en-
velope. Graphic materials should be submitted with suitable
cardboard backing. 

Types of Manuscripts: Three types of contributions are
considered for publication: full-length articles, brief com-
munications of 1,000 words or less, and letters to the editor.
Letters and brief communications can generally be pub-
lished sooner than full-length manuscripts. Books, mono-
graphs, and reports are accepted for critical review. Two
copies should be addressed to the Review Editor, Dr. 
T. Hines, 54 North Drive, East Brunswick, New Jersey. 


Processing: Acknowledgment will be made of receipt of
all manuscripts. American Documentation employs a re-
viewing procedure in which all manuscripts are sent to two
referees for comment. When both referees have replied,
copies of their comments are sent to authors with the
Editor’s decision as to acceptability. The refereeing pro-
cedure requires about 30 days. Authors receive galley proofs
with a five-day allowance for corrections. Standard proof- 
reading marks should be employed. Reprint order forms are 
forwarded with galleys. 

Format: All contributions should be typewritten on white 
bond paper on one side only, leaving about 1.25 inches (or 
3 cm) of space around all margins of standard, letter-size 
(8.5 X 11 inch) paper. Double spacing must be used through-
out, including the title page, tables, legends, and references.
The first page of the manuscript should carry both the first 
and last names of all authors, the institutions or organiza- 
tions with which the authors are affiliated, and notation as 
to which author should receive the galleys for proofreading. 
All succeeding pages should carry the last name of the first 
author in the upper right-hand corner (0.6 inch from the 
top) and the number of the page. 


Style: In general, style should follow the forms given in
the Style Manual for Biological Journals (SMBJ), published 
for the Conference of Biological Editors by the American 
Institute of Biological Sciences (1964). 

Title: The title should be as brief, specific, and descriptive
as possible. Vague and unrevealing titles may delay
publication.

Abstract: An informative abstract of 200 words or less
must be included, typed with double spacing on a separate 
sheet. This abstract should present the scope of the work, 
methods, results, and conclusions. 

ACKNOWLEDGMENTS: Financial support may be listed as 
a footnote to the title. Credit for materials and technical 
assistance or advice may be cited in a section headed 
“Acknowledgments,” which should appear at the end of 
the text. General use of footnotes in the text should be 
avoided. 

GRAPHIC MATERIALS: American Documentation requires 
finished artwork. Follow the style in current issues for lay- 
out and type faces in tables and figures. A table or figure 
should be constructed so as to be completely intelligible 
without further reference to the text. Lengthy tabulations 
of essentially similar data should be avoided. 

Figures should be lettered in black India ink. Charts
drawn in India ink should be so executed throughout, with
no typewritten material included. Letters and numbers
appearing in figures should be distinct and large enough so
that no character will be less than 2 mm high after reduction.
A line 0.4 mm wide reproduces satisfactorily when
reduced by one-half. Graphs, charts, and photographs should 
be given consecutive figure numbers as they will appear in 
the text; however, figure numbers and legends should not 
appear as part of the figure, but should be typed double 
spaced on a separate sheet of paper. Each figure should be 
marked lightly on the back with the figure number, author’s
name, complete address, and shortened title of the paper. 

For figures, the originals with two clearly legible repro- 
ductions (to be sent to referees) should accompany the 
manuscript. In the case of photographs, three glossy prints 
are required, preferably 8 X 10 inches. 

ORGANIZATION: In general, papers should state the back- 
ground and purpose of the study, followed by details of 
methods, materials, procedures, and equipment. Findings, 
discussion, and conclusions should appear in that order. 
Appendixes may be employed where appropriate for extensive
lists, statistics, and other supporting data.

Bibliography: Accuracy and adequacy of the references
are the responsibility of the author. Therefore, literature
cited should be checked carefully with the original publications.
References to personal letters, abstracts of verbal
reports, and other unedited material may be included. If
an as-yet-unpublished paper would be helpful in the evaluation
of a manuscript, it is advisable to make a copy of it
available to the Editor. When a manuscript is one of a
series of papers, the preceding member of the series should 
be included in literature cited. 

CITATION Format: 

Order: Literature cited should be sequentially numbered
as cited.

Authors: Give all authors with arrangement as follows:

Elias, A. W., B. H. Weil, and I. D. Welt


Titles: Give full titles of articles in English, with the
language of the original indicated as: (In Ger.)
Journals: Journal titles should be given in full. 


Monograph and Serial Data: Should be presented in
order as follows: Volume, issue number, pagination, and
year. The issue number should be given in parentheses if
journal pagination is not continuous from issue to issue.
Pagination should be inclusive. Year of publication should
be given in parentheses. An example is given below:

Bishop, D., A. L. Milner, and F. W. Roper, Publication
Patterns of Scientific Serials, American Documentation,
16 (No. 2): 113-21 (1965).

American Documentation is published in January, April, 
July, and October. One copy is included in the individual 
membership fee ($20.00 per year), three copies in the contributing
membership fee ($100.00 per year), and up to five
copies in the sustaining membership fee ($500.00 per year).
Nonmembers may subscribe at $18.50 per year, postpaid in 
the US. Single copies may be purchased for $4.65 each. 
Communications concerning memberships, subscriptions, re- 
prints, renewals, back issues, advertising, and changes of 
address should be sent to the American Documentation 
Institute, 2000 P Street, NW, Washington, D. C. 20036. 

American Documentation is indexed in Library Litera- 
ture, Current Contents of Space, Electronic & Physical 
Sciences, Library Science Abstracts, Science Citation Index,
Chemical Abstracts, and Documentation Abstracts. 

American Documentation is entered for second class mail- 
ing at Baltimore, Maryland. 
Editorial


WHAT IS THE STATE OF THE ART? 


Recent conferences and institutes directed by your Editor and Associate Editor have 
offered more-than-ample support for a feeling that I have had for a number of years.
People have practical problems and expect help in solving them. Sophisticated theoretical 
constructs of nonexistent, ideal situations do not help those engaged in the day-to-day 
activities of special libraries and information centers. Neither do the conventional 
librarianship courses offered at most of our library schools. 


How do we help the neophyte to learn what has gone on before? The critical, authori- 
tative “state-of-the-art” review is one possible solution; the book review is another. These 
are sorely needed, particularly in the area that I have characterized as “science informa- 
tion” (Editorial, Am. Doc., Oct. 1964). After all, most of our members and readers are 
actively engaged in the development and maintenance of on-going information services, 
and they want to know how others have solved problems similar to those that they 
encounter day-by-day. 


It is very encouraging to note that within recent months articles and papers directed 
to the solutions of some of these problems have indeed appeared with increasing frequency. 
Our sister publication, Special Libraries, for example, is to be congratulated upon its 
editorial policy of encouraging such contributions and of publishing them. 


American Documentation, too, has been fortunate in receiving an increasing number 
of excellent, practical state-of-the-art papers. Excellent examples of these may be found 
in the masterful syntheses by Marguerite Fischer and John Markus in the October 1965 
issues of our journal. My students are very grateful. Our thanks, too, to Ted Hines and 
his Columbia University students for their exceptionally good book reviews. With a 
plethora of proceedings publications (e.g. Drexel TICA and American University’s 
Technology of Management Series), critical book reviews may often be the “infanticidal 
agents” for brain children refractory to editorial “birth control” methods. 


Isaac D. Welt, Ph.D.
Associate Editor, American Documentation
Deputy Director for Scientific
and Technical Information Systems
Center for Technology and Administration 
The American University 
Washington, D.C. 20006 
Comprehensive Dissemination of Current Literature*

A résumé of the Ames Laboratory Selective Dissemination
of Information (SDI) System is presented and its
potential for future generation computers is discussed.
The system is compared with other operational SDI
systems with particular emphasis on the design differences.
The Ames Laboratory system's adaptability to
different input tape compositions and subject coverage
is shown through a study of the results of 26 production
runs made from two distinct document sources.
A detailed analysis of the Ames Laboratory SDI System
is made for a 40-run period in 1965, including a discussion
of shortcomings of the system and suggested
solutions to eliminate certain areas of "noise."

C. R. SAGE
Institute for Atomic Research
Iowa State University
Ames, Iowa

* Work was performed in the Ames Laboratory of the U. S. Atomic
Energy Commission. Contribution No. 1880.


• I. Introduction


Systems, such as the Ames Laboratory Selective Dissemination
of Information System (SDI), are commonplace
at many installations and they must be evaluated accurately
and the results disseminated so that the design
criteria of these and future systems are continually improved.
The results described are unique because they
convey a segment of the scientific community’s reaction
to an operational computerized information system, while
the designers of the programmed system are only bystanders
measuring and evaluating these reactions. The
system is designed with a minimum of user or document
restrictions and adapts to individual users and source
documents depending upon user participation.

Section II contains a brief review of the design of the
Ames Laboratory SDI System (1). The differences in
design from other current awareness systems are emphasized.
A detailed explanation is given concerning profile
design and the “decision-making” algorithms coupled with
comments concerning the choice of these particular
methodologies. Section III is a comparative study of the
system in operation using two distinct types of documents
and 21 profiles representing 18 individual scientists. The
main purpose of this study was to determine the system’s
ability to adapt to different document sets through mathematical
control. The 21 profiles chosen for this study
were a segment of 35 profiles which had been employed
in SDI since its inception, but had not undergone any
major manual revisions.

Section IV contains the results of the system while in 
operation during 1965. The document input used in 1965 
actually covered a six-month calendar period. The users 
and profiles are grouped loosely into the following seven 
categories: metallurgy, chemistry, experimental physics, 
theoretical physics, reactor, engineering, and mathematics 
and computers. The only reason for categorizing profile 
interests was because the display of the graphic results 
would be too bulky on an individual profile basis. We 
have made no attempt to screen out poor profiles, disin- 
terested users, or poor discipline coverage. Many of the 
profiles shown had been extensively revised and expanded 
during this period of operation. The number of docu- 
ment entries is large. One source document purchased 
from the Institute for Scientific Information, Philadel- 
phia, Pa., has a very broad coverage of more than 1,060 
scientific journals as indexed in The Science Citation 
Index 1965 and all U. S. Patents. The second source doc- 
ument (2), Nuclear Science Abstracts, has a narrower 
scope, having undergone a human selection process. How- 
ever, it covers the international literature, including 
numerous governmental reports on nuclear science and 
technology, and is perhaps the most comprehensive 
abstracting service in the nuclear science field. 
• II. Ames Laboratory SDI Computer System


Interest profiles consist of single words or groups of 
two to six words, each with a significance value ranging
between 0 and 1 to four significant figures. Single words
and groups of words are statistically independent in the 
matching process; words within each group imply that 
all words must be in a given document or the significance 
value of that word group is not considered. Truncation 
and extension capabilities of words and groups of words 
are available to the user. Words and groups of words 
can be negated. Negative terms are not weighted; they 
are treated as total negation. A profile is limited to 
10,000 words or word groups; a user is limited to 99 
profiles. 
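
The profile rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Ames Laboratory program: the names `ProfileTerm`, `term_matches`, and `matched_significances` are hypothetical, truncation and extension are left out, and the sample terms are taken from the listing in Fig. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileTerm:
    """One profile entry (hypothetical structure): a single word or a group."""
    words: tuple                # one word, or a group of two to six words
    significance: float = 0.0   # between 0 and 1; not used for negated terms
    negated: bool = False       # negative terms carry no weight


def term_matches(term, document_words):
    """A group counts only if every one of its words is in the document."""
    return all(word in document_words for word in term.words)


def matched_significances(profile, document_words):
    """Return the significance values of the matching positive terms.

    Returns None when a negated term matches, since negation is total.
    """
    values = []
    for term in profile:
        if not term_matches(term, document_words):
            continue
        if term.negated:
            return None
        values.append(term.significance)
    return values


# Sample terms from Fig. 1.
profile = [
    ProfileTerm(("INFORMATION", "RETRIEVAL"), 0.8644),
    ProfileTerm(("SDI",), 0.6799),
    ProfileTerm(("US", "PATENTS"), negated=True),
]
```

Each term is evaluated on its own, which mirrors the statement that single words and word groups are independent in the matching process.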

It has been observed from 83 profiles now employed 
in SDI that the word composition of profiles varies 
greatly. The average profile comprises
79.8 words or word groups. The range in size of profiles
varies from 2 to 1,241 words or groups of words per 
profile. Users are not restricted to follow a thesaurus, 
and they have the capability to include in their profiles 
foreign language terms, journal sources, authors, and any 
term or combination of terms they feel is pertinent to 
their interests. We encourage large profiles; however, we 
have no control over this situation. The majority of 
participants in SDI have started with comparatively 
small profiles and some feel these smaller profiles fit their 
needs. However, the majority of users continue to expand 
their profiles from week to week. Each individual is 
responsible for the performance of his profile. He bene- 
fits from the system by the amount of effort he expends 
to fully utilize the potential offered by the system. The 
system is designed with enough flexibility to accommodate 
varying needs of users without creating burdensome main- 
tenance problems. Figure 1 is a condensed example of 
a typical profile. 


*SAGE, CHARLES R.
402RESEARCH
USER NUMBER = 00881

PROFILE WORD     CHAR CNT   PROFILE NO   SIGN.VALUE
AM-DOCUMENT         15       00881 02      .1700
INFORMATION         11       00881 02      .0002
J-ACM               05       00881 02      .6166
SDI                 15       00881 02      .6799
HP                  15       00881 02
LUHN                15       00881 02      .3000
INFORMATION         11       00881 02
RETRIEVAL           09       00881 02      .8644
ADAPTIVE            08       00881 02
INFORMATION         11       00881 02
DISSEMINATION       13       00881 02      .6972
FEDERAL             07       00881 02
COUNCIL             07       00881 02
SCIENTIFIC          10       00881 02
TECHNOLOGY          10       00881 02      .3000
RELEVANCE           09       00881 02
COORDINATE          10       00881 02
INDEX               05       00881 02
LINKS               05       00881 02
ROLES               05       00881 02      .3000
COMMITTEE           09       00881 02
SCIENTIFIC          10       00881 02
TECHNICAL           09       00881 02
INFORMATION         11       00881 02
PROGRESS            08       00881 02
REPORT              06       00881 02      .3000
CR                  15       00881 02
SAGE                15       00881 02      NEG.
US                  15       00881 02
PATENTS             07       00881 02      NEG.
ARTHUR              06       00881 02
D                   15       00881 02
LITTLE              06       00881 02
INC                 15       00881 02
REPORT              06       00881 02      NEG.


Fig. 1. Ames Lab SDI user profile listing
In the matching process of document words and profile
words, only the number of characters specified in the
profile word records is matched. All words have a maximum
of 15 characters. All of the words in a document
entry are used in the matching process with the exception
of a 210-word dictionary file of articles, prepositions,
etc. In the original systems design of our SDI system
we could not afford to create or transcribe our own
machine-readable documents. Consequently, we programmed
the “front-end” of the system to accept any
type of machine-readable input. The “front-end” program
is capable of scanning and creating individual word
records from six various types of document formats:
Science Citation Index, Chemical Titles, Sandia Corporation
Publications Accession Lists, Nuclear Science
Abstracts, IBM KWIC Index, and Ames Laboratory
Publication Master File. This program was written in
modular form to compensate for the addition or deletion
of formats.
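
One possible reading of the character-count rule is sketched below in Python. It rests on assumptions that the text does not confirm: that word records are fixed 15-character fields padded with blanks, so that a count of 15 demands a whole-word match while a smaller count compares leading characters only. The function names and the short stop-word set (standing in for the 210-word dictionary file) are hypothetical.

```python
MAX_CHARS = 15  # every word record holds at most 15 characters

# Stand-in for the 210-word dictionary of articles, prepositions, etc.
STOP_WORDS = frozenset({"A", "AN", "AND", "IN", "OF", "THE"})


def document_word_records(text):
    """Split a document entry into upper-case word records, minus stop words."""
    words = text.upper().split()
    return [word[:MAX_CHARS] for word in words if word not in STOP_WORDS]


def word_matches(profile_word, char_count, document_word):
    """Compare only the first char_count characters of the padded fields.

    Assumption: fields are blank-padded to MAX_CHARS, so char_count == 15
    behaves as an exact match and a smaller count as a truncated match.
    """
    profile_field = profile_word.ljust(MAX_CHARS)[:char_count]
    document_field = document_word.ljust(MAX_CHARS)[:char_count]
    return profile_field == document_field
```

Under this reading, the Fig. 1 entry INFORMATION with a count of 11 would also select INFORMATIONAL, while SDI with a count of 15 would select only SDI itself.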

We are confronted with an upper limit due to computer
memory size. This “open-ended” design feature
has allowed the system to operate on a production basis
in a relatively short period of time since we have not
saddled our installation with preparing document input.
We are in the delightful position of shopping for various
machine-readable source documents of input which would
be pertinent to our scientists’ research interest areas.
Government, industry, etc., are now starting to make
available various types of tape information files, which
should broaden the scope of our current awareness computer
service. Our present input consists of 6,500-7,500
document entries per week and is running approximately
three and a half hours per week on an IBM 7074/1401
12 tape, 20K computer. Expansion of literature coverage
would not be an expensive item taking into account the
broad scientific literature coverage our scientists would
have at their disposal.

The “decision making” for selection has been designed
to compensate for various source documents of different
composition (titles and authors only; titles, authors,
sources, and abstracts, etc.). This compensation has been
achieved through varying threshold levels for uniquely
composed document sets at the time the summation is
made of significance values to determine if document
entries should be transformed into notifications. The sum
of significance values of words and word groups within
a given document entry unique to a particular user is
obtained by use of the formula for the probability of the
union of two events. If we let $T$ be the total probability
that the user will want a particular document entry,
the function below progressively calculates $T_K$.

$$0 < P_K \le 1 \qquad \text{(Probability of Word or Word Group Significance Value)}$$

$$T_K = T_{K-1} + P_K - P_K T_{K-1} \qquad \text{(Summation Function)}$$

The threshold value mentioned above is compared with
the sum of the significance values. If it is greater than
the sum, a notification is not generated for a user.
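
The summation function and the threshold test can be illustrated with a short Python sketch. The function names are hypothetical, and starting the running total at zero is an assumption the text leaves implicit.

```python
def combined_probability(significance_values):
    """Apply T_K = T_(K-1) + P_K - P_K * T_(K-1) over the matched terms."""
    total = 0.0  # assumed starting value before any term is added
    for p in significance_values:
        total = total + p - p * total
    return total


def should_notify(significance_values, threshold):
    """No notification is generated when the threshold exceeds the sum."""
    return combined_probability(significance_values) >= threshold
```

Two matched terms of .3 each combine to .3 + .3 - .09 = .51, so together they would pass a .5 threshold that neither passes alone. The recurrence gives the same result as one minus the product of the (1 - P_K) factors, which is why the order of the terms does not matter.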


After receiving the notification, the user specifies his 
interest in a particular document by pushing out a 
“Port-A-Punch®” option on a response card, which is
attached to the notification containing the original docu- 
ment information. The options available to the user are: 
“OF INTEREST, DOC. REQUESTED,” “OF INTER- 
EST, DOC. NOT REQUESTED,” “IMPARTIAL, 
DON’T ADJUST PROFILE,” “OF NO INTEREST,” 
“THE USER ABOVE IS ABSENT.” The profile words 
and word groups which interacted with equivalent docu- 
ment entry words and word groups are increased, de- 
creased, or not adjusted, depending on the types of 
response options punched and their frequency distribu- 
tion within the total document set of a particular run. 

There are two distinct feedback mechanisms. A normal 
feedback function (3) involves increasing, decreasing, or 
not adjusting term significance values as demanded by 
user responses. If response cards are not returned within 
a prescribed time limit (four weeks) from the date of 
distribution, they are treated as negative or “OF NO 
INTEREST.” We have found this to be very valuable 
because disinterested users (and every similar service
of this sort will have them) are gradually starved for 
notifications. This “cut-off” has also aided us in com- 
piling statistics on all notifications generated by the 
system. 
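
How a response card might be turned into a feedback direction, including the four-week cut-off, is sketched below. The mapping is an assumption for illustration only: the text does not say how the “THE USER ABOVE IS ABSENT” option is handled, so it is treated here as no adjustment, and the size of each increment or decrement (given in reference 3) is not modeled.

```python
from datetime import date, timedelta

# +1 = increase, -1 = decrease, 0 = leave significance values unchanged.
RESPONSE_DIRECTION = {
    "OF INTEREST, DOC. REQUESTED": +1,
    "OF INTEREST, DOC. NOT REQUESTED": +1,
    "IMPARTIAL, DON'T ADJUST PROFILE": 0,
    "OF NO INTEREST": -1,
    "THE USER ABOVE IS ABSENT": 0,  # assumption: treated as no adjustment
}

CUT_OFF = timedelta(weeks=4)  # prescribed time limit for returning a card


def feedback_direction(response, distributed_on, today):
    """Return +1, -1, or 0, or None while an unreturned card is still pending.

    A card not returned within four weeks of distribution is treated as
    "OF NO INTEREST".
    """
    if response is None:
        if today - distributed_on > CUT_OFF:
            return -1
        return None
    return RESPONSE_DIRECTION[response]
```

For example, `feedback_direction(None, date(1965, 9, 4), date(1965, 10, 3))` returns -1, because 29 days have passed without a returned card.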

An abnormal feedback function (3) is also used in 
order to stabilize the system during transient fluctuations 
caused by abnormal document distribution to allow for 
interest changes and to allow for renewed or reformed 
interest changes. All profile words and word groups
which matched equivalent document words and word 
groups, but for which a document was not selected for 
a notification, are affected by the abnormal feedback 
function. The abnormal or slow increment function is 
based on the supposition that profile words occurring 
infrequently in document entries of a given dissemina- 
tion run should be reconsidered by the user, since the 
projected volume of notifications containing these profile
words will be relatively small. Therefore, their respective 
significance values are incremented similarly, but to a 
lesser degree (the normal increment and decrement are 
not derived by a linear function), to a positive response 
word or word group significance value. Conversely, fre- 
quently appearing profile words with low significance 
values as calculated from past responses are considered 
general terms and a proportionately small increment or 
no increment is desired. This abnormal feedback func- 
tion is designed to limit the increment range within 0 to 
half that of a normal increment; the value in the range 
determined by the inverse ratio of the frequency of the 
word or word group appearing in the document entries 
of each dissemination run. 
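
The bound on the slow increment can be made concrete with a small sketch. The exact function is not given here, so the formula below is only one possible reading of “inverse ratio of the frequency”: half the normal increment divided by the number of document entries in the run that contained the term. The name `slow_increment` and both parameters are hypothetical, and the further damping of frequent terms with low significance values is not modeled.

```python
def slow_increment(normal_increment, occurrences):
    """Increment for a term that matched but produced no notification.

    Assumed form: half the normal increment, scaled by 1 / occurrences,
    so the result always lies between 0 and half the normal increment.
    """
    if occurrences < 1:
        raise ValueError("the term must have matched at least one entry")
    return 0.5 * normal_increment / occurrences
```

A term seen once in a run would receive the full half-increment, while a term seen in a hundred entries would receive one hundredth of that, which is the qualitative behavior the paragraph describes.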

It is clear that each word or word group has a unique 
weight with regard to each individual user and the system 
has the natural ability to automatically analyze and 
readjust this weight under complete jurisdiction of the 
user by his past document selections. With little effort
the user acquires a rather sophisticated linguistic analysis
tailored completely to the individual rather than to scientific
discipline. We have allowed users the option of
assigning constant weights excluding the feedback mecha- 
nisms, but the few users who attempted this scheme were 
flooded with indiscriminate information and have con- 
verted to the feedback scheme. Actually, a user, who 
attempts to manually assign weights to profile words and 
word groups for selection of pertinent current literature 
from the weekly volume we are processing, will find his 
task most time consuming and impossible for utilization 
purposes. A user is prepared to weight terms in relation 
to his own field of endeavor; however, a problem of 
selectivity arises with broad-coverage source documents 
and the user’s lack of knowledge of these same terms and 
the role they play in other scientific fields. Feedback 
automatically compensates to the great degree of allowing 
both the user and the system the capabilities of handling 
many diverse interest profiles efficiently without having 
to filter the document input into disciplines beforehand 
and running specialized inhibited source documents au- 
tonomously. We feel many SDI systems in operation are 
not basically selective systems because document entries 
are preselected before computer dissemination and 
selection. 

We have found that to generate any participation enthusiasm
one must be prepared to offer a very comprehensive
literature coverage to a prospective user. If a
current awareness service cannot honestly supplement
the user’s present literature capabilities, the task of “selling”
the system for service will be very difficult. During
a six-month period of production the number of users
has increased from 1/3 to 3/4 the total number of users
eligible to participate in SDI (restricted to senior scientists
of the Ames Laboratory). Based on two “drop-outs”
and response cut-offs, 84% of the present users are actively
participating in SDI. One of the “drop-outs” was
in a very specialized area of research which SDI was not
covering with its document input. Participation in SDI
is strictly voluntary.

In the program design of the Ames Laboratory SDI
system (14 production programs) we attempted to make
the system “open-ended.” There is no theoretical limit
to the actual size of a document, and the upper limit of
words and word groups for profiles indicates no hindrance
to the user. There are practical limitations to the system
which are dictated by computer configurations. Our SDI
system is dependent on the sorting capabilities of the
computer; however, this inhibitor is not reflected theoretically
to volumes but to efficiency of computer running
time. The important point is that this system with
its present basic systems design has the potential of being
universal enough for future generation computers. The
system's adaptability to various sources of input certainly
generates enthusiasm over the advent of effective optical
scanners for application beyond projected routine storage
applications. The system's adaptability to individuals
rather than to disciplines or specific research areas
lends itself to the feasibility of centralized nonspecialized
current awareness centers. On-line audio response mechanisms
should replace punching response holes; visual display
consoles will replace printed notifications; retrospective
search capabilities will retrieve reference and cited
articles; transformation of diagrams, formulae, photographs,
etc. into machine-readable form will expand
profile capabilities; and any number of exotic improvements
can easily be incorporated into this basic system.†


• III. Statistical Measurement of Document
Adaptability


The purpose of this study was to measure the relative
effectiveness of the Ames Laboratory SDI computer system
to adapt to differently composed source documents
by variance of threshold values. Twenty-one profiles
which had not undergone major manual revisions were 
picked for this study employing the statistics of 26 pro- 
duction runs. Profiles were chosen strictly on the above 
criteria; they were not filtered out because of poor design 
or because they were functioning incorrectly. All 21 pro- 
files had been run in a pilot study prior to production. 
The significance values of the words and word groups 
had been partially adjusted as a result of feedback from 
the pilot study which involved 9,428 document entries. 
Structuring and word composition of these profiles are 
quite diversified as shown in Fig. 2. 

Corresponding to Fig. 2 are brief résumés of the areas
of interest of the individual scientists who compiled the
twenty-one profiles. The differentiation of research interests
is quite evident and it is important to remember
the system was performing a service during these 26 runs
and the statistics compiled are the candid responses of
these 18 scientists.

The source documents scanned for this study were 13
semi-monthly issues of Nuclear Science Abstracts and 13
weekly issues of Science Citation Index. Nuclear Science
Abstracts tape records contained authors, corporate
author, title, source, and keywords of Volume 19, Numbers
1, 4, 8-17, 1965. Science Citation Index tape records
contained authors, titles, and sources of weeks 33-45,
1965. There was a definite overlap in journal article
coverage; Nuclear Science Abstracts differed by containing
USAEC research and development reports; Science
Citation Index differed by containing all U. S. patents.
There were certain prejudices held by the scientists in
relation to these source documents. In a recent survey (4)
of all our users we found that 43% would prefer to be
dropped from Nuclear Science Abstracts coverage, and
2% would prefer to be dropped from the Science Citation
Index coverage. We could not compensate for these
discrepancies.


† To be subjected to detailed experimental [word illegible].





USER'S NAME  PROFILE NO.  NO.1 WD.   NO.2 WDS.  NO.3 WDS.  NO.4 WDS.  NO.5 WDS.  NO.6 WDS.
                          POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.
USER-A        2101        11 117 1042 $5 9 6
USER-B       10801        28 6 68 6 2
USER-C       12501        hy 28 101 11 3
USER-D       14601        81 15 1
USER-E       16401        20 1 25 3 5
USER-F       18601        77 80 85
USER-G       20301        49 4 35
USER-H       25701        47 65 10
USER-I       31400        15 19 10 3
USER-J       31501        3 13 18 65. 7 2
USER-K       33401        3% 1 26 5
USER-L       33701        hi 16 54 s
USER-M       36101        24 27 ма 5 30 4X
USER-N1      40101        8 1: 166 3
USER-N2      40105        5 7 6
USER-N3      40106        12
USER-N4      40107        ia 20 1 1 1
USER-O       41001        9 18 1
USER-P       41601        105 1 52 19 2
USER-Q       76601        136 75 29 22 3l 13 29 lk
USER-R       86601        2 15 7 L
NO. USERS    21           610 370 2069 95 219 31 50 7

Fig. 2. Analyses of the 21 profiles for computer study
Name      Title            Area of interest
USER-A    Chemist          Mass spectroscopy applied to inorganic and
                           analytical chemistry, corrosion chemistry,
                           high vacuum techniques, special gas handling
                           problems, precision isotope abundances,
                           absolute isotope abundances, isotope
                           geochemistry
USER-B    Chem. Eng.       Thermodynamics of liquid metal systems,
                           kinetics of metal halide reactions,
                           preparation of high [remainder illegible]
USER-C    Metallurgist     Mechanical metallurgy
USER-D    Chemist          Physical and inorganic chemistry
USER-E    Ceramic Eng.     High temperature refractory ceramics, high
                           temperature systems and reactions
USER-F    Metallurgist     Physical and mechanical metallurgy, oxidation
USER-G    Metallurgist     Physical and chemical metallurgy
USER-H    Chemist          Analytical chemistry
USER-I    Chem. Eng.       Solvent extraction chemistry
USER-J    Chemist,         Physical chemistry, solid state physics,
          Metallurgist,    physical metallurgy
          Physicist
USER-K    Physicist        Theoretical physics and nuclear physics
USER-L    Physicist        Solid state theory
USER-M    Physicist        Nuclear physics
USER-N    Chemist          X-ray and neutron diffraction
USER-O    Physicist        Nuclear physics, accelerator design
USER-P    Physicist        Experimental solid state physics
USER-Q    Metallurgist     Rare-earth metallurgy and solid state physics,
                           alloy theory, high pressure physics
USER-R    Chemist          Surface chemistry, electrochemistry and
                           adsorption, thermodynamics, statistical
                           mechanics
Figure 3 is the graphic result of the Nuclear Science
Abstracts runs. The top of Fig. 3 is a bar graph indicating
the number of neutral (Impartial), negative (No
Interest), and positive (Of Interest) responses from top
to bottom, respectively, with the numeric quantity of
positive responses printed for each run in the corresponding
bar. A total of 25,378 document entries made up
these 13 runs. Each document entry averaged 43.1 words
and each run averaged 1925.15 document entries. The
average numbers of responses per run were 260.54 positive,
384.46 negative, and 218.54 neutral for a total of
863.54. The bottom line graph of Fig. 3 illustrates the
percentage of positive responses based on the total number
of positive and negative responses for each run. We
did not take into account, in determining these percentages,
the neutral responses because we do not know how
to classify them. The line across the line graph is the
average percentage of positive responses based on the
total number of positive and negative responses for each
of the 13 runs. All percentages are relative to notifications
disseminated and do not include all potential relevant
documents. The average relative percentage of
positive responses was 40.39%.

Figure 4 is the graphic result of the Science Citation Index
runs. A total of 79,348 document entries made up these
13 runs. Each document entry averaged 11.2 words and
each run averaged 6103.7 document entries. The average
numbers of responses per run were 145.85 positive, 209.00
negative, and 109.59 neutral for a total of 464.44. The
average relative percentage of positive responses based
on the total number of positive and negative responses
was 41.10%. Figure 5 (a and b) gives the numeric averages
of the total neutral, negative, and positive responses
for each user of the two different source documents used.
Graphs are available of each individual’s results over the
26-run period (5).
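
The relative percentages quoted above follow directly from the per-run averages once neutral responses are set aside. A minimal Python check, with the hypothetical function name `percent_positive`:

```python
def percent_positive(positive, negative):
    """Positive responses as a percentage of positive plus negative.

    Neutral ("Impartial") responses are left out, as in the study.
    """
    return 100.0 * positive / (positive + negative)


# Per-run averages reported for the two source documents.
nuclear_science_abstracts = percent_positive(260.54, 384.46)
science_citation_index = percent_positive(145.85, 209.00)
```

Rounded to two decimals these give 40.39 and 41.10, matching the reported figures.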

A threshold value of .5 was used for Nuclear Science 
Abstracts and a threshold of .3 was used for Science
Citation Index. In the original design of the increment-decrement
feedback function we believed that profile
words and word groups with significance values near the
.5 threshold level would receive the largest increments
and decrements on the assumption that a .5 threshold 
would be used for all documents. We found through 
experimentation prior to this study that if the threshold 
were held constant, significance values of profile words 
and word groups could not adapt uniformly to document 
base changes and produce optimum results. The signifi- 
Fig. 3. Nuclear Science Abstracts accumulated averages for comparative study
[Figure: number of responses and percent positive for each run, by date, with averages]

Fig. 4. Science Citation Index accumulated averages for comparative study


USER NAME PROFILE NUMBER PERCENT PLUS AVERAGE PLUS AVERAGE NEGATIVE | AVERAGE TOTAL 
| USER-A 2101 47.06 59.62 67.08 181.23 
, USER-B 10801 19.72 1.08 4.38 30.51, 

USER-C 12501 41.54 18,69 26.31 55,08 
i USER-D 14601 37.07 6.33 10,75 25.17 
' ЏБЕВ-Е 16401 36.61 20,08 31.77 68.62 
| USER-F 18601 . 35.69 8.92 16.08 32.08 
` USER-G 20301 21.95 11.08 39,38 65.15 
T USER-H 25701 1.08 17.46 22.15 62.15 
| | USER-1 31401 61.29 8.77 | 5.54 17.69 
| USER-J 31501 20.00 1.31 | 5.23 ‚17.31 
! usER-K 33401 22.58 10,92 37.46 54.77 
| USER-L 33701 53.98 24.00 20.46 53.16 
| USER-M ` 36101 37.55 14,15 | 23.54 45.31 
| USER-NI 40101 lur. 32 | 9,31 11,69 25.77 
! USER-N2 40105 65.66 5.42 2.83 9,33 
| USER-N3 0107 43.72 12.85 16.54 | 33.46 
` USER-D ^ Miool 67.97 12.08 5.69 22.71 
| USER-P 41601 23.08 ‚ 5.08 16,92 22.15 
; USER-Q 16601 50.30 12.85 . 12,69 16.77 
' USER-R 86601 24.68 1.73 5.27 9,91 


Кї. ба. Accumulated averages of 18 runs (Nuclear Science Abstracts) from February 16, 1965, through October 26, 1965 
| Я 
H 

| ' American Documentation — October 966 161 
| 


USER NAME PROFILE NUMBER PERCENT PLUS 
USER-A . 2701 37.13 
USER-B 10801 21.05 
USER-C 12501 56.02 
USER-D 14601 153.36 
USER-E 16101. 69.03 
'USER-F 18601 35.98 
USER-G 20301. 47.89 
USER-H — 25701 29.36 
USER-I 31400 55.17 
USER~J 31501 47.73 
USER-K 33h01 9.h6 
USER-L 33701 43.04 
USER-M 36101 50.76 
USER-NI 10101 77.73 
USER-N2 1,0105 33.42 
USER-N3 14,0106 . 90.91 
USER-N4 10107 70.00 
USER-0 1001 15.00 
USER-P 11601 1.89. 
USER-Q ^ 16601 8.88 
USER-R — — 86601 71.43 


AVERAGE PLUS "AVERAGE NEGATIVE AVERAGE TOTAL 
31.08 56.23 150.77 
1,85 | 6.92 11.38 
16.16 12.92 31.08 
2.23 0.77 le 5l. 
6.00 | 2.69 Uy Shp 
5,92 ; 10.54 29.69 
2.62 2.85 8.00 
5.31 | 12.77 18.46 
1.23 1.00 2.62 
1.62 | 1:77 4.23 
0.54 > 5.15 6.00 
13.08 17.31 30.62 
5.15 | 5,00 11.54 | 
12.62 | 3.62 25.38 
10.00 ; 19.92 . 30.46 
2.31 0.23 2.85 
10.17 4.62 | 16,54 
2.54 0.85 4.08 
CNTDOS а 9, 5l 18.62 
3.46 15.54 43.38 

. 1.92. 0.17 | Ң.15. 


Fio. 5b. Accumulated averages of 13 runs (Science Citation Index) from September 4, 1965, through November 27, 1965 


cance values which adapt to less descriptive source docu- 
ments will be too sensitive to more descriptive source 
documents; the end result being a number of more de- 
gcriptive source documents flooding users with document 
notifications having a low relative percentage of interest. 
Less descriptive source documents would be too discrimi- 
nate in selection resulting in a high relative percentage 


of interest. Therefore, we lowered the threshold for the. 


less descriptive source document, Science Citation Indez, 
based on numeric ratio of words, intuition, and ап “edu- 
cated” guess. From the positive percentages derived, the 
relative positive percentages differed by only 61%. 

To further verify our findings we decided to plot
comparative percentages of positive responses based on
individual percentage averages rather than on total number.
Figure 6 shows the graphic results of these findings; the
dotted line is the average of the two averages. The
total averages differed here by only 2.17%, Nuclear Science
Abstracts being 48.17% and Science Citation Index
being 50.34%.
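
The two ways of averaging used in this section can give quite different answers for the same users. The sketch below contrasts them in Python, using three rows taken from Fig. 5a purely as sample data; the function names are ours, and the choice of three users is an assumption made to keep the example short.

```python
from statistics import mean

# (average plus, average negative) for three users taken from Fig. 5a.
users = {
    "USER-A": (59.62, 67.08),
    "USER-B": (1.08, 4.38),
    "USER-I": (8.77, 5.54),
}


def pooled_percentage(rows):
    """Percentage based on the total number of responses of all users."""
    rows = list(rows)
    plus = sum(p for p, _ in rows)
    minus = sum(n for _, n in rows)
    return 100.0 * plus / (plus + minus)


def mean_individual_percentage(rows):
    """Average of each user's own percentage, every user weighted equally."""
    return mean(100.0 * p / (p + n) for p, n in rows)


pooled = pooled_percentage(users.values())
individual = mean_individual_percentage(users.values())
print(f"pooled {pooled:.2f}%  individual {individual:.2f}%")
```

The pooled figure is dominated by the user who receives the most notifications, while the individual average gives a small profile the same weight as a large one. This is why the two kinds of average reported in this section do not coincide.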

Other interesting figures to be noted were the close
ratios of numeric quantities between source documents
of the neutral, negative, and positive responses. Because
of these ratios, some questions could be raised about the
value of the keywording scheme used in Nuclear Science
Abstracts. Another interesting figure was the difference
in the degree of selectivity, the average number of positive
responses per user in relation to the average number
of document entries per dissemination run. The degree
of selectivity for Nuclear Science Abstracts was 2.1% and
36% for Science Citation Index. We consider this difference
reasonable because of the differences in scientific
coverage of both source documents. This definitely indicates
that the system is adapting not only to actual document
composition but also to differentials in discipline
coverage, which is most encouraging.


• IV. Results of Ames Lab SDI, 1965 


The sources of documents used over the 40-run period
described here were the same as described in Section III.
Only 13 Nuclear Science Abstracts runs were made during
1965 out of a possible 26 runs which might have been
made. This particular tape service is in only its pilot
phases and many problems arise in its preparation which
result in incomplete usage of this source document. All
Nuclear Science Abstracts source documents were run
with a .5 threshold.


Fig. 6. Science Citation Index and Nuclear Science Abstracts relative positive percentage derived on individual averages for comparative study


Science Citation Index source documents were run
weekly for a six-month period from July 1965 to January
1966. A threshold value of .5 was used for the first three
runs of Science Citation Index and .3 for the remaining
24 runs.

A total of 177,180 document entries was scanned
through 40 runs involving 2,804,356 eligible terms for
matching. Eight percent of the original number of terms
were considered “general” and were automatically deleted
in the scanning program. These 2,804,356 document
terms yielded 10,317,233 matches with single profile
words; 969,586 of these matches were used in the selection
process resulting in 54,018 notifications selected for
distribution to profile users. A breakdown of the number
of document entries, document terms, matched terms,
matched terms in selected documents, matched terms in
nonselected documents, number of notifications selected,
and number of users receiving notifications for each of
the 40 runs is shown in Fig. 7. The category “matched
terms in nonselected documents” refers to complete words
or word groups matching in documents for which
notifications were not generated. This does not include
incomplete word groups or terms affected by negation.
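
The totals quoted above reduce to per-run averages and simple ratios that make the scale of the matching step easier to see. This is a minimal Python sketch using only figures stated in the text; the dictionary keys and the two derived ratios are our own labels, not quantities named by the system.

```python
RUNS = 40

# Totals for the 40 dissemination runs, as stated in the text.
totals = {
    "document entries scanned": 177_180,
    "eligible document terms": 2_804_356,
    "matches with single profile words": 10_317_233,
    "notifications selected": 54_018,
}

# Average load per dissemination run.
per_run = {name: count / RUNS for name, count in totals.items()}

# Derived ratios (illustrative labels).
terms_per_entry = (
    totals["eligible document terms"] / totals["document entries scanned"]
)
matches_per_term = (
    totals["matches with single profile words"]
    / totals["eligible document terms"]
)

for name, value in per_run.items():
    print(f"{name:35s} {value:12.1f} per run")
print(f"eligible terms per document entry   {terms_per_entry:12.1f}")
print(f"profile-word matches per term       {matches_per_term:12.2f}")
```

Each eligible document term matched between three and four single profile words on average, which shows how much of the work falls on the later selection step rather than on the initial matching.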

Figure 8 (a and b) shows an analysis of the word
structuring of all profiles as they appeared after the final
run for 1965. The majority of these profiles had
undergone extensive revisions during the 40-run period and
many hardly resembled their original state. We had an 
average of 3.6 manual revisions of profiles per week. 
These included the addition and deletion of words and 
word groups, readjustment of significance values, and, in 
some cases, complete rewording and restructuring. We 
did not keep an accurate account of these revisions and 
cannot reflect this variable in the number of notifications 
distributed or percentages of relevancy. 

Figures 9-16 present the graphic results of the number
of notifications disseminated, represented by bar graphs
indicating the number of neutral, negative, and positive
responses. Below each bar is the number of users who
received notifications for that particular run. The line
graphs are the relative percentages of positive
notifications: the solid line is based on the total number of
positive and negative notifications for each run, and the
dotted line indicates the average individual user
percentages for each run. The dotted and solid straight lines
running across the line graph are the corresponding total
percentage averages for the 40-run period. The numeric
averages are also printed in the line graph.

Seven groups were chosen and the profile users were 
categorized according to their research interests into one 
RUN         DOCUMENT   DOCUMENT     MATCHED   MATCHED TERMS   MATCHED TERMS   NOTIFICATIONS   USERS
            ENTRIES    TERMS        TERMS     IN SELECTED     IN NONSELECTED  SELECTED
                                              DOCUMENTS       DOCUMENTS
NSA-19-01      1746      71748      241152        ?645            6315            5102          55
NSA-19-04      1296     753394      211912        4155            ?960            2?65          51
NSA-19-08      2017      92136      341754        3145           17927             384          40
NSA-19-09      2401     104022      297406        4979           15153            2247          47
NSA-19-10      1931      81504     3943145        3844           13052            1734          56
NSA-19-11      2246      96392      465348        5363           15815            2305          53
NSA-19-12      2281     101778      424824        5835           16603            7469          55
NSA-19-13      149?      63141      281739        2808           11113            1128          ?1
ISI-65-26      658?     174052      145628         930            4603             112          ?1
ISI-65-27      5063      57195      123342        2011            2588            1547          49
NSA-19-14      1780      87806      382537        3320          145413            1572          56
ISI-65-28      5140      59168      118714        1949            418?            1733          54
ISI-65-29      5504      62598      146251        1629            3293            1487          51
ISI-65-30      ?409      75347      214511        2112            5439            2358          5?
NSA-19-15      1931      50585      316955        2789           13035            1234          34
ISI-65-31      5023      57713      149505        1692            3347            1619          53
NSA-19-16      1831      74054      267744        2473           13735            1117          ?9
ISI-65-32      4184      54191      176947        1184            ?859             998          51
ISI-65-33      6581      75635      214423        1641            4928            1434          56
ISI-65-34      5812     553800      189734        1418            4159            1219          57
ISI-65-35      5635      62131      215627        1613            577?            1229          55
ISI-65-36      4835      55097       11141         ?65            3753              ?3          ??
ISI-65-37      6341      73642      232636        1191            ?68?            1?42          ?2
ISI-65-38      5841      69341      185817         052            4005             106          ??
ISI-65-39      5436      51184      157884         682            3397             584          30
NSA-19-17      2058      83036      461592        3960            ?966            2551          54
ISI-65-40      3933      45597      135601         120            2609             601          53
NSA-19-18      2166      83997      189544        4385           13302            1800          57
ISI-65-41      8991      ?0648      282284        1921            6386            1486          51
ISI-65-42      5728      66843      210842        1060            4231             825          55
ISI-65-43      6246      74322      214890         746            4501             625          5?
ISI-65-44      7323      84921      316351        1146            5628             908          36
ISI-65-45      4522      61756      275148         785            3792             565          54
ISI-65-46      6286      12590      339812        1295            5810             960          52
ISI-65-47      4887      55159      251357         11?            3140             583          56
ISI-65-48      5344      61114      299058         973            4493             765          62
ISI-65-49      5692      63919      290854        1011            ?512             ?76          60
ISI-65-50      5072      57073      270159         530            402?             675          57
ISI-65-51      4921      55977      242078         563            3542             545          55
ISI-65-52      5859      67564      222311         632            3938             481          5?
TOTALS       177180    2804356    10317233           ?          240999           54018        2184
PER RUN      4429.5    70108.9    257930.8      2212.2          7027.5          1350.4        54.6

Fig. 7. Statistical results of 40 dissemination runs
USER/S NAME       PROFILE NO.
METALLURGY-A      12501
METALLURGY-B      3701
METALLURGY-C      18601
METALLURGY-D1     20002
METALLURGY-D2     20003
METALLURGY-D3     20004
METALLURGY-D4     20005
METALLURGY-D5     20006
METALLURGY-E      20301
METALLURGY-F      20501
METALLURGY-G      31501
METALLURGY-H      76601
TOTALS, NO. USERS 12

USER/S NAME       PROFILE NO.
CHEMISTRY-A       2101
CHEMISTRY-B       7201
CHEMISTRY-C       7801
CHEMISTRY-D       9101
CHEMISTRY-E1      14601
CHEMISTRY-E2      14602
CHEMISTRY-F       31501
CHEMISTRY-G       25701
CHEMISTRY-H       32501
CHEMISTRY-I1      40101
CHEMISTRY-I2      40102
CHEMISTRY-I3      40103
CHEMISTRY-I4      40104
CHEMISTRY-I5      40105
CHEMISTRY-I6      40106
CHEMISTRY-I7      40107
CHEMISTRY-I8      40109
CHEMISTRY-J       54001
CHEMISTRY-K       56101
CHEMISTRY-L1      74501
CHEMISTRY-L2      74502
CHEMISTRY-L3      74503
CHEMISTRY-L4      74504
CHEMISTRY-L5      74505
CHEMISTRY-L6      74506
CHEMISTRY-L7      74507
CHEMISTRY-M       84301
CHEMISTRY-N1      86601
CHEMISTRY-N2      86602
CHEMISTRY-N3      86603
TOTALS, NO. USERS 30


NO.1 WD. NO.2 WDS.
POS. NEG. POS. NEG. 
44 28 101 
7 14 1 ЕЈ 
п во 25 
2 3 
2 
H 2 
A 3 
1 
43 4 35 
29 24 
3 13 48 
136 15 29 22 
345 200 333 22 
NO.1 WD. NO.2 WDS.
POS. NEG. POS. NEG. 
11 117 1042 65 
* 3 13 
32 ^ ғ 
9 15 
81 
5 
3 13 48 
47 65 
5 2 
8 1 165 
1 1 1 
4 23 
2 ы 
5 1 
12 
12 20 
13 
1 45 
64 345 104 12 
2% 16 
2 2 
2 5 
5 1 12 1 
2 
16 10 
3 l 
5 5 11 1 
2 15 
27 ^^ 
11 58. 
305 594 1923 73 


Metallurgy 


NO.3 WDS. NO.4 WDS. NO.5 WDS. NO.6 WDS.
POS. NEG. POS. NEG. POS. NEG. POS. NEG.
11 3 
i 3 2 
1 
2 4 2 
é 
4. 
1 
6 7 2 
34 13 29 5 4 
126 13 46 6 10 
Chemistry 
NO.3 WDS. NO.4 WDS. NO.5 WDS. NO.6 WDS.
POS. NEG. POS. NEG. POS. NEG. POS. NEG.
9 6 
2 2 
2 
14 
15 1 
3 1 
64 T 2 
3 10 
12 
3 
3 
6 
T 1 1 
25 5 2 
é 1 
5 
4 3 
6 7 1 
6 
1 1 
17 2 
4 2 
^ 
3 
2 
233 17 35 Ф 


Fig. 8a. Analysis of all Ames Lab SDI Profiles
Experimental Physics

USER/S NAME       PROFILE NO.   NO.1 WD.   NO.2 WDS.  NO.3 WDS.  NO.4 WDS.  NO.5 WDS.  NO.6 WDS.
                                POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.
EXP.PHYS-A        31501         ? ? 45 4? 1 2
EXP.PHYS-B        33701         41 1? 54
EXP.PHYS-C        35101         1 ? ? ?
EXP.PHYS-D        ?             24 7? ? ? ? 1
EXP.PHYS-E        ?             36 2 ?
EXP.PHYS-F        34201         0 1 1 2 2
EXP.PHYS-G1       40301         ? ? 39 1 ?
EXP.PHYS-G2       40302         12 ?
EXP.PHYS-G3       40303         ? 1 ?
EXP.PHYS-H        41001         ? ? 1 1
EXP.PHYS-I        41401         1? ? ? 5 1 ?
EXP.PHYS-J        41501         ? 4 19 ? ?
EXP.PHYS-K        41601         105 1 ? 19 2?
EXP.PHYS-L        42001         ? 20 2? 1 ? 1
EXP.PHYS-M        45801         1 ? ?
EXP.PHYS-N1       47601         4 25 5
EXP.PHYS-N2       47602         ?? 61
EXP.PHYS-N3       47603         48 ? ?
EXP.PHYS-O        78501         2 ? ? 1 ?
TOTALS, NO. USERS 19            411 1? ?5 ? 134 1 14 2

Theoretical Physics

USER/S NAME       PROFILE NO.   NO.1 WD.   NO.2 WDS.  NO.3 WDS.  NO.4 WDS.  NO.5 WDS.  NO.6 WDS.
                                POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.
THEOR.PHYS-A      33401         34 1 2? 5
THEOR.PHYS-B      33501         2 23 2 ? ?
THEOR.PHYS-C      33601         50 3 23 5
THEOR.PHYS-D      40201         1 13 ? 5
THEOR.PHYS-E      34801         ? ? 102 ? 4
TOTALS, NO. USERS 5             ? ? ? 1 ? 1

Engineering

USER/S NAME       PROFILE NO.   NO.1 WD.   NO.2 WDS.  NO.3 WDS.  NO.4 WDS.  NO.5 WDS.  NO.6 WDS.
                                POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.
ENGINEERING-A     10801         28 4 ? 2
ENGINEERING-B     14401         20 ? 25 3 ? 2
ENGINEERING-C     31400         15 14 1? 3
ENGINEERING-D     53401         ? 4 15 4
ENGINEERING-E     7980?         ? 12 ? ?
TOTALS, NO. USERS 5             ? ? ? 5 ? 2 3

Reactor

USER/S NAME       PROFILE NO.   NO.1 WD.   NO.2 WDS.  NO.3 WDS.  NO.4 WDS.  NO.5 WDS.  NO.6 WDS.
                                POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.
REACTOR-A1        50401         2 13 34 ?
REACTOR-A2        50402         1 13 21 2
REACTOR-A3        50403         2 23 ? 4
REACTOR-A4        50404         2 1
REACTOR-A5        50405         3 ?
REACTOR-A6        50406         4 2
TOTALS, NO. USERS 6             12 ? ? ?

Mathematics and Computers

USER/S NAME       PROFILE NO.   NO.1 WD.   NO.2 WDS.  NO.3 WDS.  NO.4 WDS.  NO.5 WDS.  NO.6 WDS.
                                POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.  POS. NEG.
MATH.COMP-A       9101          ? 25 ? 1
MATH.COMP-B       56101         17 3 1
MATH.COMP-C       ?             ? ?
MATH.COMP-D       92901         ? ?
TOTALS, NO. USERS 4             46 123 3 1


Fig. 8b. (Continued)


or more groups. Each individual is involved in a specific 
area of research within each group. However, we thought 
some insights could be gained from a loosely categorized
grouping into scientific disciplines. A detailed study of
single profile performance within groups would reveal
diverse results from SDI, but to graphically represent
these results would be too lengthy for this article. Test
profiles and their selected notifications were not shown
in the graphic results, although their totals are included
in Fig. 7. We had no means at our disposal by which to
screen these totals, but they did not significantly distort
the figures presented.

Figure 9 contains the graphic results of 12 scientists
representing the field of metallurgy. This particular
group achieved the poorest relative percentages of the
seven groups represented. There was a great variance in
the number of notifications and relative percentages from
one run to the next. Approximately 40% of the users’
profiles did not receive notifications for a given run.
Many complaints were filed from this group concerning
weak subject coverage of both source documents. The
general construction of metallurgy profiles was not as
extensive as the average profile of all groups. A unique
factor with this group was the inclusion of many weighted
positive journal source notations which resulted in users
receiving all articles within a given journal. The
smoothing of the percentage lines of the last 10 runs was
caused by journal source notation significance values
having been decremented heavily from feedback of earlier
runs.

Fig. 9. Accumulated averages of metallurgy group for 40 runs

The chemistry group possessed the largest number of
users and the most comprehensive profiles. In Fig. 8a
it appears that single users have many profiles when
actually these profiles represent junior scientists and
graduate students working under a senior scientist. The
numbers of notifications selected and relative positive
percentages were evenly distributed over the 40-run
period considering the normal fluctuations of journal
distribution and the addition of 14 new profiles. The people
in this group did not have any complaints about weak
coverage of subject matter, and the majority felt their
current literature coverage had been expanded because
of SDI. The nine small profiles of this group received
very few notifications after their initial runs and had
very little effect on the results of this group except in
the calculation of positive percentages based on individual
results.

Fig. 10. Accumulated averages of chemistry group for 40 runs

The third group, experimental physics, was close to the
norm in terms of profile construction and results obtained.
Initially, problems arose in the system in adjusting to
Nuclear Science Abstracts and Science Citation Index
source documents. The first 20 runs, as noted in Fig. 11,
indicated poor results. Consequently, the users of this
group revamped their profiles extensively, adding a
significant number of authors and negative journal sources.
In this group, authors accounted for 53.8% of the positive
profile terms, and 89.2% of the negative terms were
journal sources. The individuals of this group received
the most notifications per user. Two scientists of this
group felt the literature coverage of the two source
documents was weak, while the remaining users considered
the coverage to be adequate. The members of this group
seemed particularly anxious to be notified of the latest
journal articles and had very little use for articles
published six or more months previously.

Figure 12 represents the graphic results of the
theoretical physics group. Figure 8b shows there were only
five profiles in this group, which resulted in dramatic
fluctuations in the relative percentage lines. One of the
five users showed little enthusiasm for SDI and received
very few notifications. His lack of interest affected the
relative percentage of individual averages. The users of this
group did not make any manual revisions to their profiles
after they submitted their originals. There was an
indication of improvement in the number of notifications
disseminated and relative positive percentages during the
last 15 runs. This improvement was attributed to the
feedback mechanism.

The engineering group also had only five profiles,
reflecting diverse engineering interests. These interests
included chemical, ceramic, and mechanical engineering.
The first 20 runs, Fig. 13, produced lower results than
the second 20 runs. However, it seems obvious from the
number of notifications disseminated that the bulk of
pertinent scientific literature for engineering was selected
from Nuclear Science Abstracts source documents.
As with the theoretical physics group, few manual
revisions were made to engineering profiles, and the advent
of Science Citation Index source documents indicated
that these profiles were not extensive enough to select
documents on title, author, and source information only.

Figure 14 shows the graphic results of the reactor
group. The performance of the reactor group profiles,
although small in number, indicated obvious discrepancies
in subject coverage of the source documents. Nuclear
Science Abstracts, as an abstracting journal, is adequate
in servicing this group with perhaps the periodic
selection of fringe literature from Science Citation Index
source documents. Because of the small number of
notifications selected for these users over the 40-run period,
it was doubtful if the significance values of the words
and word groups reached equilibrium after the last run.
Much of the literature covering reactor technology is
generated by USAEC scientists and published in R&D
report form. This indicates that Nuclear Science Abstracts
source documents are more apropos to this particular
group than Science Citation Index source documents.

The last group, mathematics and computers, was the
one biased group of the seven. Three of the users are
directly involved with the Ames Laboratory SDI system,
one the author of this article. From a personal
standpoint, the notifications disseminated from Science Citation
Index source documents were quite pertinent to his
interests, particularly articles from noncomputer-oriented
journals. Eight articles were selected during this 40-run
period describing SDI systems; three of these articles
were from noncomputer, nondocumentation journals. The
graphic results of this group are represented in Fig. 15.
It should be noted from the graphs that very few
notifications were disseminated from the Nuclear Science
Abstracts source documents; four of the users negated
the complete source.

Figure 16 shows the composite graphic results of the
seven groups. The relative positive percentages were
somewhat erratic during the first 20 runs, but definitely
smoothed out during the second 20 runs. Feedback was
the major contributor to this improvement with manual
profile revisions contributing as the second factor. The
percentages may seem low compared with the published
results of other SDI systems; however, one must be aware that
100% of the notifications selected and disseminated were
used in the calculations of these percentages. The
responses to notifications returned for feedback from a
given run were not, in most cases, received prior to the
next run. The average cut-off period (all notifications
prior to the cut-off date were treated as negative
responses) was 4.4 weeks from the date the notifications
were distributed. The average degree of selectivity (the
percentage of document entries of notifications out of
the total number of document entries scanned) per user
for the entire 40-run period was 0.55%. Considering the
diverse subject coverage of our source documents, we
were quite elated over this percentage even though we
are not aware of pertinent documents not being selected
for dissemination.
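
The degree of selectivity can be approximated from the totals in Fig. 7. The sketch below is one plausible reading of the definition given above, not necessarily the exact computation used: it assumes the 2,184 user total in Fig. 7 counts users receiving notifications summed over all runs, and the function and parameter names are ours.

```python
def degree_of_selectivity(
    notifications: int, user_runs: int, entries_scanned: int, runs: int
) -> float:
    """Notifications per user per run as a percentage of entries per run."""
    notifications_per_user_run = notifications / user_runs
    entries_per_run = entries_scanned / runs
    return 100.0 * notifications_per_user_run / entries_per_run


# Totals for the 40 runs, taken from the text and from Fig. 7.
pct = degree_of_selectivity(
    notifications=54_018,
    user_runs=2_184,       # assumed: users receiving notifications, all runs
    entries_scanned=177_180,
    runs=40,
)
print(f"{pct:.2f}%")
```

Under these assumptions the result lands close to the 0.55% reported, that is, roughly 25 notifications per user out of some 4,400 document entries scanned in an average run.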

The number of users representing four groups
(theoretical physics, engineering, reactor, and mathematics and
computers) was small. Unfortunately we had no control
over this situation. It was apparent one would require
a minimum of 10 to 15 users per group to measure the
comprehensiveness of source documents related to a
particular scientific discipline. Table 1 represents the average
percentages and numbers of selected notifications as
represented graphically in Figs. 9-16.


• Conclusion


There are various areas within our SDI system which
cause “noise” and inefficiencies. Utilizing two sources of
input has created a burden for our users in designing
their profiles. Science Citation Index follows various
computer-oriented standards for authors, sources, and
scientific notation while Nuclear Science Abstracts is more
inclined to follow library-type standards. This situation
has forced our users to define words and word groups of
a particular research interest in various word
combinations (e.g., C R Sage, C Sage, Charles R Sage, Charles
Sage, Charles Russell Sage, Sage) to fully adjust to these
input differentials. As mentioned earlier in this article,
43% of our users have requested that they be excluded
from Nuclear Science Abstracts coverage, which
undoubtedly is due partially to the ambiguity of author and
source notation and keywording. Certain editing routines
will be incorporated into the system where algorithms and
reasonably short table “look-ups” can compensate for
some of these ambiguities. Having incorporated six
various sources of input into this system we see a very critical
need for universal standards. A subtle “noise” has been
detected in matching authors and their respective initials
as a two-word group. With large volumes of documents
the names normally considered unique are not unique
and this produces many irrelevant notifications. This
factor is one explanation for the large number of neutral
responses. Cross matches are occurring with initials and
last names between co-authors of a particular document,
a problem we did not forecast in designing the system.
Compensations have been made in our “third generation”
computer SDI system for processing an author’s last name
and initials as a single term.
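
The cross-match problem, and the effect of treating last name and initials as a single term, can be illustrated with a short Python sketch. It is hypothetical throughout: the normalization rule (surname written last, given names reduced to initials) and all function names are assumptions made for illustration, not the editing routines of the Ames Laboratory system.

```python
import re


def author_key(name: str) -> str:
    """Collapse 'given names + surname' to a single 'SURNAME INITIALS' term.

    Assumes the surname is written last; given names become initials.
    """
    parts = [p for p in re.split(r"[\s.]+", name.strip()) if p]
    if not parts:
        raise ValueError("empty author name")
    surname, given = parts[-1], parts[:-1]
    initials = "".join(g[0] for g in given)
    return f"{surname} {initials}".strip().upper()


def word_level_match(profile_words, authors) -> bool:
    """Word-group matching: every profile word occurs somewhere among authors."""
    words = {w.upper() for a in authors for w in re.split(r"[\s.]+", a) if w}
    return all(w.upper() in words for w in profile_words)


def single_term_match(profile_author: str, authors) -> bool:
    """Single-term matching: the whole author key must belong to one author."""
    return author_key(profile_author) in {author_key(a) for a in authors}


variants = ["C R Sage", "C Sage", "Charles R Sage",
            "Charles Sage", "Charles Russell Sage", "Sage"]
keys = {v: author_key(v) for v in variants}

# A document by two other people whose names share words with the profile.
coauthors = ["J Sage", "C R Smith"]
cross = word_level_match(["C", "R", "Sage"], coauthors)   # True: false hit
exact = single_term_match("C R Sage", coauthors)          # False: no hit
```

In this sketch the six spellings listed above collapse to three keys (SAGE CR, SAGE C, SAGE), and the co-author document that satisfies the word-group match is rejected once surname and initials must belong to the same person.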

The peak input load for sorting efficiency on our
present computer system using about 100 profiles is 
approximately 15,000 Science Citation Index or 5,000 
Nuclear Science Abstracts entries per run. The running 
time of the computer asymptotically increases with num- 
bers of additional documents greater than the aforemen- 
tioned quantities. We have not calculated the maximum 
number of profiles for ideal running efficiency per run. 

Even with the large volumes of document entries which we
process, there are still gaps in literature coverage of
certain research interests. It is most important that these
gaps be filled if we expect to have a comprehensive
current awareness service. In the near future Iowa State
University contemplates making this service available to
its scientifically oriented faculty and at that time one
would have a better knowledge of weak coverage over
many diversified scientific disciplines (6).

The mathematical equations used in the feedback 
process are not the ultimate function in this application. 
We feel that the present equations are fulfilling our needs 
but there is room here for more experimentation. Inves- 
tigations will be made to possibly refine this process when 
time permits. However, we are totally convinced that the 
concept of feedback is mandatory for effective compre- 
hensive scientific information dissemination. 

The bulk of input selected and disseminated to our
users is composed of title, author, and source records.
One might consider this document composition a low
order document definition, yet the SDI system is
adaptable enough to utilize these document entries. If an
installation intends to institute an SDI system using
relatively small numbers of documents composed of title,
author, and source records, we recommend that they be
KWOC (Keyword Out of Context) or KWIC (Keyword
In Context) indexed and manually selected by users
rather than by computer selection. From our experience
with Nuclear Science Abstracts, keywording of
documents is expensive and impractical for the relative
additional benefits derived from SDI, particularly if it
restricts numbers and coverage of documents. The
utilization of author-derived abstracts seems to be the middle
step for third generation computer systems. If authors
do not write abstracts, one has another expensive
problem. Ultimately, texts or partial texts will be the ideal
form of input for adaptive SDI. We are running limited
experiments with texts and abstracts, and in the early
stages of experimentation, our present SDI system can
hypothetically adjust and perform better with this form
of input. Hopefully, fourth- and fifth-generation
computers coupled with effective optical scanning devices
will make this economically feasible. We are sidestepping
the development of automatic abstracting and keywording
techniques because of limited resources and our own
projection that necessary hardware will be available for text
processing, thus making these theoretically developed
techniques obsolete.

Table 1. Accumulated Group Averages of SDI Responses

Name                   Profile                        Percent   Average   Average    Average
                                                      plus      plus      negative   total
Chemistry              Cumulative 1965, 1st Half      41.94     132.50    183.40     489.25
Chemistry              Cumulative 1965, 2nd Half      46.17      97.85    114.10     316.10
Chemistry              Cumulative 1965, Total         43.64     115.18    148.75     402.68
Engineering            Cumulative 1965, 1st Half      33.42      47.55     94.75     177.15
Engineering            Cumulative 1965, 2nd Half      42.62      13.85     18.65      45.10
Engineering            Cumulative 1965, Total         35.13      30.70     56.70     111.13
Reactor                Cumulative 1965, 1st Half      51.67      22.84     21.37      58.58
Reactor                Cumulative 1965, 2nd Half      28.47      10.89     27.37      42.37
Reactor                Cumulative 1965, Total         40.91      16.87     24.37      50.47
Metallurgy             Cumulative 1965, 1st Half      21.10      46.40    173.50     304.75
Metallurgy             Cumulative 1965, 2nd Half      35.15      26.45     48.80      95.95
Metallurgy             Cumulative 1965, Total         24.68      36.43    111.15     200.35
Computer               Cumulative 1965, 1st Half      66.26       3.12      1.50       8.29
Computer               Cumulative 1965, 2nd Half      47.31      10.35     11.58      26.47
Computer               Cumulative 1965, Total         50.66       6.74      6.56      17.88
Experimental physics   Cumulative 1965, 1st Half      34.22      78.80    151.45     259.75
Experimental physics   Cumulative 1965, 2nd Half      40.69      93.10    135.70     244.75
Experimental physics   Cumulative 1965, Total         37.45      85.95    143.58     252.25
Theoretical physics    Cumulative 1965, 1st Half      20.83      12.56     47.70      73.20
Theoretical physics    Cumulative 1965, 2nd Half      32.57       9.30     19.25      32.95
Theoretical physics    Cumulative 1965, Total         24.61      10.93     33.48      53.08
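
For readers unfamiliar with the KWOC form recommended above, the following Python sketch builds a small index in which each keyword serves as a heading followed by the full titles containing it. The stop-word list and the two sample titles are invented for illustration and are not taken from the Ames Laboratory system.

```python
from collections import defaultdict

# Illustrative list of "general" words that are not indexed.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def kwoc_index(titles):
    """Map each keyword to the full titles in which it appears."""
    index = defaultdict(list)
    for title in titles:
        seen = set()
        for raw in title.split():
            word = raw.strip(".,;:()\"'").lower()
            if word and word not in STOP_WORDS and word not in seen:
                seen.add(word)
                index[word].append(title)
    # Alphabetical order of keywords, as in a printed index.
    return dict(sorted(index.items()))


titles = [
    "Selective dissemination of information",
    "Information retrieval in nuclear science",
]
index = kwoc_index(titles)
for keyword, entries in index.items():
    print(keyword.upper())
    for entry in entries:
        print("    " + entry)
```

A user scanning such a listing sees every title under each of its significant words, which is why the form suits manual selection from relatively small document sets.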

Testing and experimentation of an improved SDI
system for an IBM 360/50 computer is underway.
Additional features are being added to the system described
in this article to include capabilities of generating
quarterly for each individual profile a KWOC index of “OF
INTEREST” articles which were returned during the
quarter. Producing this listing will give the user a manual
reference to his notification card file and would perhaps
give visual insights on how to improve his profile.

Fig. 12. Accumulated averages of theoretical physics group for 40 runs

Fig. 13. Accumulated averages of engineering group for 40 runs

Fig. 14. Accumulated averages of reactor group for 40 runs

The “front end” program of the 360 SDI system will
edit various source documents into standard format and
store them in a bulk storage device. We contemplate
developing a retrospective retrieval system based around
this bulk document file, but we are not setting a high
priority on this project. We are convinced from user
reactions and comments that the majority of scientists
are not as concerned about retrospective retrieval
as about current awareness. Their biggest concern with the
“scientific literature explosion” is to be aware of current
research. Obviously scientists exploring new research
areas would find retrospective retrieval useful; however,
these people seem willing to tackle this searching manually
if given a choice between current and retrospective
retrieval. Ultimately, we hope to provide them with both
capabilities.

The new system, programmed in COBOL (Common
Business Oriented Language), will continue to serve as
an experimental laboratory as has the present system.
Additional statistical measurements are incorporated into
this system for “more-in-depth” studies. We know profile
significance values are a resource of intellectual effort,
and we intend to derive methodologies to tap this
knowledge bank for other information-related applications.
Analysis of thesauri and keyword effectiveness and
automatic retrospective retrieval query building are only two
potential areas which can be developed while providing
a valuable service simultaneously.

The relative percentage of interest computed as
illustrated in Section IV of this article may tend to appear
low compared with other percentages published about
other SDI systems. However, our only criterion for
measuring any degree of worthwhile service is the
reactions and comments of our users. Two specialized
scientific listings (7, 8) are being compiled; one of these
listings (8), through exclusive use of SDI, has accumulated
1,500 publications related to that specialized field
in one year. It must be realized that this system does
not replace existing methods or sources of literature
searching, but rather supplements them by making more
information accessible.


The Scholar and the Future of Microfilm


R. R. DICKISON

Oak Ridge National Laboratory*
Oak Ridge, Tennessee

* Operated by Union Carbide Corporation for the U. S. Atomic
Energy Commission.


The reasons for the failure of roll film and microcards
to be fully integrated into library practices are examined
and compared with present practices in microfiche
systems. It is concluded that microfilm systems in the
recently introduced form of the fiche will finally be
integrated into library usage.


In 1944 Mr. Fremont Rider (1) wrote a book about the
exciting possibilities and potentialities of “micro-cards”
as a solution for library growth problems. Rider noted
(p. 92) that “We have had coming into our research
libraries a mere trickle of micro-materials, where our
micro-enthusiasts had hoped for, and had expected to
have, a flood. And the reason why the flood has never
come is the one just stated; that micro-reduction has
never yet really integrated itself into library practice.”

Mr. Rider then went on to envision a micro-reduction
system in which the catalog card for a book carried the
micro-text of the book on its reverse side, mass-produced
and inexpensive readers freely available to every user, and
a circulation system in which a duplicate micro-copy of a
book was given to a borrower, or, if he was willing to
pay for it, a photostatic enlargement of the book. Mr.
Rider argued persuasively that the new “micro-cards”
could, if not within a few weeks then ultra-conservatively
within two or three years, change the trickle into a flood.

In the next fifteen years from 1945 to 1960 certainly
microcards received a fair trial in practically every
library. One federal agency alone, the Atomic Energy
Commission, made and distributed probably 20 million
microcards before discontinuing their production
altogether in July 1964. But, despite the considerable
promotion and development, the dam never really broke. Just
as roll film failed to integrate itself into library practice,
microcards never came into general use in libraries.

In the past several years, another microform, the fiche,
has made its appearance in American libraries. The
question naturally arises, will the microfiche form
integrate itself into library practice, or in fifteen years will
it have failed just as roll film and microcards did?


One way to make a reasonable guess as to the future
of microfiche is to look at the reasons given in the
literature for the failure of roll film and microcard systems,
and to see if these difficulties are still present in a
microfiche system.

While a fairly wide variety of reasons is given for the
previous failures, they can all be associated with one or
the other of what appear to be the two major difficulties:
lack of standardization and user resistance.

In addition to Rider, others, e.g., Scott (2), Tate (3),
Riggs (4), and Piez (5), noted that lack of standardization
was an effective barrier to the widespread use of
roll film. Libraries received roll film perforated or
unperforated, negative or positive, 16 mm, 35 mm, or 70
mm, and in a wide range of reduction ratios, usually
anywhere from 8x to 30x. It is an understatement to
say that this situation discouraged its use. Libraries were
reluctant to purchase the wide variety of equipment
needed to handle such diversity. Since libraries were
reluctant to purchase, equipment manufacturers could
not produce in quantity and prices remained high, which
made libraries reluctant to buy.


User resistance was also an effective barrier to roll
film. Users did not like being forced to go to the library
for a reader and, once there, having difficulty in threading
and using an unfamiliar machine. If these were overcome
there sometimes remained the difficulty of locating
a particular image on a reel containing hundreds or even
thousands of images. If the desired image was located,
usually an enlarged copy was desired, but there existed
no convenient way to get a copy. In view of all the
difficulties it seems a little surprising that roll film even
survived in library usage. These difficulties have now,
however, been overcome with the introduction of the
cartridge reader and the several indexing devices
associated with it for rapid frame location. Probably roll
film in cartridges will make an impact on libraries in the
next few years similar to the impact the microfiche form
is making.

Lack of standardization was apparently not as severe
a problem with microcards, but still existed. Scott (2)
said of the micro-opaque system that it was “better
than roll film for filing and retrieval and . . . closer to
being a system than any other microform, but it is a
system restricted to consultation of published material,
costly as a means of making a few copies only,
difficult to reproduce and not sufficiently controlled by
standards.” There were at least three different sizes of
microcards and at least five types of readers, none of
which was inexpensive enough for libraries to make them
freely available.

User resistance to microcards appears to have been
about as formidable as it was to roll film. The quick and
easy methods which Rider foresaw for duplicating
microcards or making enlargements from them never
materialized. More than one microcard user commented that
all you could do with a microcard was read it, if you were
lucky. The failure of microcards to be integrated into
general library usage can probably be traced to the
failure to devise good and/or cheap reader-printers.





Undoubtedly, the situation with regard to standardization
is different with microfiche than it was with roll
film or microcards. The efforts of the federal agencies
to achieve standardization, begun in June 1963, have
resulted in the first edition of “Federal Microfiche
Standards,” issued in September 1965 (8), to be used by all
federal agencies and their contractors. These standards,
which adhere to one of the two international standards
for microfiche, appear to have had already a substantial
impact, and libraries are beginning to benefit from these
efforts. The “elusive inexpensive portable film reader”
(2, p. 490) is at last in sight. The ORNL library, and
probably other libraries, has begun distribution in
quantity of a simple desk-top fiche reader costing less than
$100 per reader. More than 100 such readers have been
distributed at ORNL. As has been noted (6), the cost,
size, and simplicity of the reader is a key factor in a
microfilm system.

Little has been published as yet about user resistance
to microfiche. The survey (7) of 100 librarian users of
the services of various federal agencies disclosed some
dissatisfaction with microfiche, particularly with the
quality. However, the experience of the ORNL library
indicates that the objections most frequently raised to
roll film and microcards do not exist in a microfiche
system, and user resistance is consequently disappearing.





Several of the information centers have converted their 
entire files to microfiche and one is supplying duplicate 
microfiche copies of documents to its clientele rather than 
bibliographic references to the documents, except, of
course, for copyrighted material. A microfiche can be 
easily and cheaply duplicated and currently about 450 a 
week are being duplicated and distributed to microfiche 
users for their files. An automatic step and repeat en- 
larger furnishes hard copy quickly and cheaply if it is 
required. Users have reported no difficulties in placing 
the film in the reader, in locating the desired image, and 
in obtaining enlargements if needed. About 20 conveni- 
ently located reader-printers which provide quick enlarge- 
ments have been distributed. 

Not all of the difficulties with microfilm have been 
resolved with the microfiche. “Microfilms are microfilms 
and not the original book” (9), and this difficulty appears 
incapable of resolution. Not all of the necessary
equipment for a complete microfiche system is commercially
available, but it is reasonable to expect that it will appear
and that the costs of the entire system will go down.
The quality can and probably will be improved. 

It seems not unreasonable to predict now that
microfilm, in the form of the fiche, will finally be integrated
into library practice and that the exciting possibilities for
solving library growth problems, which Mr. Rider
envisioned over 20 years ago, are now at hand.


References


1. Rider, F., The Scholar and the Future of the Research
   Library, Hadham Press, New York, 1944.
2. Scott, P., Advances and Goals in Microphotography,
   Library Trends, 8:458-492 (1960).
3. Tate, V. D., An Appraisal of Microfilm, American
   Documentation, 1:98 (1950).
4. Riggs, J. A., The State of Microtext Publications, Library
   Trends, 8:379 (1960).
5. Piez, G. T., Microfiche Standard Adopted, Special
   Libraries, 55:390 (1964).
6. Gray, D. E., Practical Experience in Microfacsimile
   Publications, American Documentation, 3:58-61 (1952).
7. SLA Government Information Services Committee, Users
   Look at Information Centers, Special Libraries,
   57:45-50 (1966).
8. Federal Microfiche Standards, P.B. 16730. 1st Ed.
   September 1965.
9. Jackson, W. A., Some Limitations of Microfilm, Papers
   of the Bibliographical Society of America, 35:281-288
   (Fourth Quarter, 1941).


^ American Documentation — October 1966 179 


PICS: The Pharmaceutical Information Control System
of Merck Sharp and Dohme Research Laboratories


MARGARET C. KOLB, JEROME T. MADDOCK,
and BARBARA N. WEAVER

Merck Sharp & Dohme Research Laboratories
Rahway, N. J.


The Pharmaceutical Information Control System (PICS),
developed at Merck Sharp & Dohme Research
Laboratories, provides centralized control and methodology
for a series of decentralized information areas in the
Division. It is compatible with and instrumental in total
data processing and analysis of research information.
Serving as a Core Index to all information resources of
the Research Laboratories, it also processes, stores, and
retrieves research project information for the staff
members for planning and retrospective searches. A register
of all domestic and international clinical research
information on experimental and in-line products of Merck
& Co., Inc., is provided by the system.

An eight-digit dual-faceted classification code was
developed based on a companywide program identification
scheme. This code of two mutually exclusive facets
enables us to identify a product with a field of research.
This code has been adopted for use in administrative
planning, cost accounting, time allocation, and internal
reporting. The uniform use of this information code
within the Research Division minimizes the vocabulary
barrier between the user and information system and
provides the system with a self-indexing device for
internal reports.

Incoming mail is copied and registered by the
information center prior to transmittal to the addressee.
Copies of all outgoing mail and intramural correspondence
are directed to the center. An information scientist
analyzes each document and selects the project code
and document descriptors. This information is punched
into 80-column cards. The documents are filed by code
and the cards are filed alphabetically by name, term,
and by date. Output forms include information,
documents, printouts of document citations, and reports to
drug regulatory agencies. Continuous system evaluation
results in reduced I/O time, better utilization of
personnel, and improved user feedback and contact.


• Introduction


One of the major problems facing industrial and
governmental organizations at present is the mushroom-like
growth of many totally incompatible “mini-systems” for
handling particular aspects of the total information
picture of that organization. Each of these “mini-systems”
fulfills its responsibility to the organizational unit it
serves directly, but fails to fulfill its responsibility as an
integral part of the total information system of the
organization. The need for compatibility and integration
of systems is evident, and must be a primary
consideration of good systems design work. This paper
describes a system which was planned to meet these
requirements, as well as the modular growth
requirements of an industrial organization.


The Pharmaceutical Information Control System
(PICS) has been developed in the Merck Sharp & Dohme
Research Laboratories to provide centralized control and
methodology for the series of decentralized information
areas in Rahway, New Jersey, and West Point,
Pennsylvania. It provides a significant quantity of relevant
intramural information to various levels of managerial
and operational employees without jeopardizing the
security of the company’s informational resources. The
document control established by this system serves as
an information bridge to the complex Division-wide
activities of project planning and data processing. Uniform
use of a classified code, based on a program identification
scheme, permits the accumulation of a total Project
Profile by relating and correlating costs, manpower,
laboratory results, and information generated during the life
of a research project. Thus, information has become a
measurable commodity which can be evaluated along
with other products of research. The System produces a
Core Index to the Information Resources of Research
which includes not only resources within the Information
Centers, but also the resources of staff scientists through
the incorporation in the system of an index to individual
specialized collections. PICS is organized on a staff level
within the Division. It provides consultant services to
the user groups to help solve their individual information
network problems.

The user group consists of chemists, biologists,
engineers, pharmacists, physicians, and research and
corporate management. The system input is approximately
1,000 documents per day, composed of reports,
memoranda, correspondence, laboratory notebooks, regulatory
agency submissions, prepublication manuscripts, and
legacy files from various directors and departments.
Published information is excluded from this system; it is
processed elsewhere in the Division. Output may be
either in the form of a summary report prepared by an
information scientist, documents, references to documents,
or oral answers to direct questions by telephone. The
choice of low cost, simple EDP equipment for PICS was
purposeful to maintain economy as well as efficiency.
PICS became operational in April 1963; a description
of the system in use earlier at Merck is contained in
Information and Communication Practices in Industry (1).

Fig. 1. PICS System


• Classification Code 


An eight-digit, dual-faceted classification scheme has 
been designed for Division-wide project identification. 
Management utilizes the code for cost accounting, time 
allocation, and internal reporting. The Information Cen- 
ter uses the code for information storage and retrieval. 
The first four-digit facet defines a research program; the
second four-digit facet, a serial code, identifies a product. 
This code of two mutually exclusive facets relates a 
product with its field of research. Tag facets are used as 
needed to identify combinations, formulations, and routes 
of administration of formulations. A numeric subdivision, 
when added to this project number, further specifies cate- 
gories of information and permits the retrieval of chem- 
ical, biological, marketing, or other specific aspects of a 
project or product. 
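
The structure of the code described above can be sketched as a small parser. This is only an illustration under stated assumptions: the article gives the two four-digit facets and an optional numeric subdivision, but not a written form, so the hyphen and period separators and the names `ProjectCode` and `parse_project_code` are hypothetical. Tag facets are left out because their format is not described.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProjectCode:
    program: str                       # first four-digit facet: research program
    product: str                       # second four-digit facet: serial product code
    subdivision: Optional[str] = None  # optional numeric category of information


def parse_project_code(text: str) -> ProjectCode:
    """Parse 'PPPPSSSS', 'PPPP-SSSS' or 'PPPP-SSSS.NN' (separators assumed)."""
    body, _, sub = text.strip().partition(".")
    digits = body.replace("-", "")
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"expected an eight-digit code, got {text!r}")
    if sub and not sub.isdigit():
        raise ValueError(f"subdivision must be numeric, got {sub!r}")
    return ProjectCode(program=digits[:4], product=digits[4:], subdivision=sub or None)
```

Because the facets are stored separately, documents can be grouped by research program, by product, or by both, which mirrors the way the code relates a product to its field of research.
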


• Vocabulary Control


The West Point and Rahway Research Information
Centers are evaluating controlled vs. uncontrolled
vocabulary.

The keywords-from-text approach is employed in
Research Information-West Point to test the hypothesis
that within this controlled homogeneous scientific
community of Clinical Research, the level of terminology is
relatively constant and is self-controlled because the users
are also the generators of the documents. This approach
is being critically tested for growth of vocabulary,
efficiency of retrieval, and economics of input against
Research Information-Rahway. There, vocabulary is rigidly
controlled by an intricate term and contact
thesaurus-based system which serves a larger, more heterogeneous
scientific community.


• Document Processing


Incoming mail is opened and copied on a Xerox 914
for the Information Center before transmittal to the
addressee. Carbon copies of all outgoing mail and
intracompany reports and memos are directed to the Center.
Thus, a copy of every research document, internally or
externally generated, is processed and stored in the
Research Information Center, ensuring strict input control
over the Division's information resources.

All documents are analyzed and coded by an Information
Scientist, who writes the code directly on the document
and marks any appropriate additional descriptors
directly on the text (with a nonreproducible chromatic
pencil). For large sets of identically coded documents,
such as form letters, the slave typewriter unit of the
870 system is programmed “on” at the code field to
generate the document code on labels simultaneously
with punching of the registration card. This same feature
of the 870 is used to automatically print all labels for the
more than 5,000 folders added to the file each year.
Recently, case report forms have been preprinted with
the project code number to reduce processing time.
Complete document description is punched into 80-column
IBM cards with field definitions for contacts (individual
or organization names) or terms, dates of documents,
initials of addressees, authors or correspondents, and
project code.

For every document a date registry card is punched,
plus any necessary term and contact cards. These cards
are identical, differing only in the first field (columns
1-25) when necessary.

The chronological “date card deck” acts as a complete
document registry of the Information Center. Two
additional auxiliary alphabetic decks are maintained as
indexes to the individual terms and contacts for retrieval.


1. The term deck contains all document descriptors,
   entries in the project code thesaurus, cross references
   to preferred synonyms and preferred abbreviations,
   and the terms from auxiliary collections.

2. The contact deck contains cross references to
   preferred individual or organizational identification,
   preferred abbreviations, names of individuals affiliated
   with organizational contacts, and names derived
   from the document.


All punched cards are sight-verified against documents.
The verifier marks the document to indicate that it has
been completely processed and marks the cards to
indicate the index into which they will be filed.
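
The card scheme described in this section, one date registry card per document plus term and contact cards that differ only in columns 1-25, can be sketched as follows. Only the 80-column width and the 25-column first field come from the text; the layout of the remaining columns, the date format, and the names `Document`, `_card`, and `punch_cards` are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

CARD_WIDTH = 80  # 80-column card
KEY_WIDTH = 25   # columns 1-25: the only field that differs between cards


@dataclass
class Document:
    date: str          # e.g. "660415"; the date format is assumed
    project_code: str  # eight-digit project code
    initials: str      # initials of addressee, author, or correspondent
    terms: List[str] = field(default_factory=list)
    contacts: List[str] = field(default_factory=list)


def _card(key: str, doc: Document) -> str:
    """Build one card image: key field, then a description common to all cards."""
    # Hypothetical layout for the columns after the key field.
    common = f"{doc.date:<8}{doc.initials:<8}{doc.project_code:<12}"
    # Left-justify the key and truncate it to exactly KEY_WIDTH columns.
    key_field = f"{key.upper():<{KEY_WIDTH}.{KEY_WIDTH}}"
    return (key_field + common).ljust(CARD_WIDTH)


def punch_cards(doc: Document) -> Dict[str, List[str]]:
    """Return the cards destined for the date, term, and contact decks."""
    return {
        "date": [_card(doc.date, doc)],
        "term": [_card(term, doc) for term in doc.terms],
        "contact": [_card(contact, doc) for contact in doc.contacts],
    }
```

Keeping everything after the first field identical means any card, whichever deck it is pulled from, carries the date and project code needed to locate the document in the project-ordered file.
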
• Document Registration of Clinical Statements of
Investigators and Case Reports


Completed Statements of Investigators (U. S. Food
and Drug Administration Forms FD 1572 and FD 1573)
are received in the Information Center prior to
transmittal to the Staff Clinicians. Each Statement of
Investigator is coded, indexed, and assigned a serial registry
number which then becomes that investigator’s
identification number for that specific clinical study. This number
is used to identify all case reports submitted by this
investigator in reporting results of his study, and is
carried through in the computer evaluation of the study
results. A complete Statement of Investigator
description, including principal Investigator’s name, date of
Statement, registry number, initials of the responsible
Merck clinician, and product code, is punched into IBM
cards in triplicate, with additional cards for any
descriptor terms and all other investigators participating in the
study. The format of these cards is consistent with the
format of those cards previously described. Cards are
filed by investigator in the contact index and by product
code and registry number in the Statement registry.
Upon approval and duplication for IND inclusion, the
S.I. form is returned to Research Information where the
registry card is signal-punched to indicate that the form
has been processed and that copies have been sent to
F.D.A. The original S.I. form is filed by investigator,
and a copy is filed by product code with that
investigator’s case reports.

All case reports are registered by Research
Information upon receipt from the investigator. A serial registry
of these reports is maintained on cards for each
investigational drug. Each case report form is stamped with the
investigator’s S.I. number and the serial patient number.
A copy of the registered case report is forwarded to the
Merck Clinician. The original is retained in the
Information Center. From this original, one card is punched
containing the investigator’s S.I. number, patient
number, patient name, age, sex, date of report, initials of the
Merck Clinician, and product code. The patient name is
restricted to four characters, utilizing the method
developed by Dr. John Tukey for abbreviations. On
controlled studies (i.e., double blind, crossover, etc.), patient
numbers from the protocol for the study are included as
well. Manuscripts, letters, etc., which contain clinical
commentary are also registered as case reports. A single
alphabetic character punched into the card indicates the
report type. The cards and documents are filed by
product code.

Up-to-date case report registries and Statement of
Investigator registries on any product are produced on
demand for the clinical staff and for preparation of
reports to government regulatory agencies through the
use of the 870 Document Writing System.


Fig. 2. Document processing (flowchart labels: INCOMING MAIL;
copies of outgoing mail; XEROX 914; original; ANALYSIS and
CODING; IBM 870; Date, Term, Contact; VERIFICATION; STORAGE;
USER/GENERATOR).


• Storage


Documents are stored in macroform chronologically by
project and code subdivision.

Legacy files from various directors and departments
are stored in microform after they have been entered
into the Core Inventory of Total Information Resources
of Research, which is maintained in the form of an IBM
card deck. This microform storage eliminates the
unnecessary build-up of files within the Division, yet maintains
their personal retrievability by filming them in
essentially the form maintained by the individual or
department.

Laboratory notebooks are issued by the Research
Information Center. Completed notebooks are stored in
the Center, in macroform, with a microform copy stored
at another location for security.


• Retrieval


Search requests are normally initiated through a
telephone call or a personal visit from the user to the Center.
At the present time, approximately 50 searches per day
are processed by information scientists. Searches are
done manually or mechanically, depending upon the
nature of the request. Information, citations, and/or
documents are retrieved through the Core Index by any
one or combination of the document description aspects.

Fig. 3. Document processing of statements of investigator and case reports.

The questions range from the correct spelling of a 
name or chemical compound to the evaluation of past 
research results. 

One of the most valuable services of PICS to Merck 
management is the extremely rapid output of retro- 
spective searches on an entire research project or a specific 
aspect of a project from the resources of its information 
bank. Since the documents are filed in project-classified 
arrangement, they are pre-ordered for this use with zero 
time delay. 

Concept searches are often initiated by scientists
pursuing a new approach to a laboratory problem. As one
of the possible search terms is traced from Core Index
to its location in file, other relevant documents utilizing
vocabulary more generic or specific are automatically
retrieved due to the classified-document arrangement.
Thus, the project arrangement complements the Core
Index as a retrieval device.
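
The way the project arrangement complements the Core Index can be illustrated with a short sketch. PICS itself used card decks and project-ordered folders, so the in-memory class below, its name `CoreIndex`, and its methods are purely hypothetical stand-ins for tracing a term to a project file and then taking everything filed there.

```python
from collections import defaultdict


class CoreIndex:
    """Toy model: a term index pointing into project-ordered folders."""

    def __init__(self):
        self._by_term = defaultdict(set)   # term -> project codes it leads to
        self._folders = defaultdict(list)  # project code -> documents in filing order

    def register(self, doc_id, project_code, terms):
        """File a document under its project and index its descriptor terms."""
        self._folders[project_code].append(doc_id)
        for term in terms:
            self._by_term[term.lower()].add(project_code)

    def concept_search(self, term):
        """Return every document filed under the projects the term leads to."""
        hits = []
        for code in sorted(self._by_term.get(term.lower(), ())):
            hits.extend(self._folders[code])
        return hits
```

A search on one term returns the whole project folder, so a document indexed only under a broader or narrower term is still retrieved because it is filed with its project.
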

Another important output form generated by Research
Information at West Point is the master set of case
reports for the clinical portion of New Drug Applications
(NDA’s) and periodic reports to the U. S. Food and
Drug Administration. The case report registry is used
to verify the completeness of the case reports in the
product file and the processed clinical data. The
information scientist meets with the responsible staff clinician
to discuss the proposed format, arrangement, and
inclusion of supporting documents other than case reports,
and to obtain final approval for the master copy before
it is released for duplication.

The Registries now produced on the IBM 870 will be 
generated from magnetic tape within a short time. 


• Summary


PICS is compatible with and instrumental in total data 
processing and analysis of research information. Serving 
as a Core Inventory to all information resources of the 
Research Laboratories, it provides research project infor- 
mation to staff members for planning and retrospective 
searches. The uniform use of the basic eight-digit code 
throughout the Division furnishes a self-indexing device 
for internal reports and minimizes the vocabulary bar- 
rier between the user and the system. A register of all 
domestic and international clinical research information 
on experimental and in-line products of Merck & Co., 
Inc., is generated by this system. 





Operational features of PICS include the absence
of any work sheets for processing documents, the
[...]ign of subsequent source references to any document
cited to provide effective bibliographic coupling,
the automation of many clerical operations through the
use of the IBM 870, and the modular system design
which offers both manual and machine retrieval
capabilities and allows for the transition to more sophisticated
equipment. Microform input and the expansion of the
Core Index are contemplated for the future.

Within PICS, functions are constantly undergoing
evaluation. Techniques such as user interviews,
retrieval-time counts, document-input counts, and time-lag
averages using punched-card methods are being used for
reduction of I/O time, better utilization of personnel, and
improved user feedback and communication. The results
of these studies will be reported in a future paper.


A Computer-Program System
to Facilitate the Study of Technical Documents¹


DANIEL G. BOBROW,² R. Y. KAIN,³
BERTRAM RAPHAEL,⁴ and J. C. R. LICKLIDER⁵


Symbiont is a computer-and-program system for use in
research on computer-aided study. It stores, retrieves,
and displays documents and parts of documents. It
“semi-automates” the taking of verbatim notes. It
facilitates the manipulation and intercomparison of graphs.
And it conducts searches for passages of text that
contain specified words or phrases. Experience with
Symbiont and plans for its improvement are described.


• Introduction


The purpose of this paper is to describe a system,
consisting of a digital computer⁶ and a computer program,
intended for exploration of man-machine interaction and
computer assistance to man in the study of technical
documents.

The system provides a physical study situation that
includes a desk, an electric typewriter,⁷ a display screen,
and a light-sensitive pointer or stylus (“light pen”).⁸
The user of the system, whom we shall call “the student,”
requests services and controls operations by typing
command characters or symbols on the typewriter or by
touching illuminated areas of the display screen with
the light pen. The computer and program system, which
we call “Symbiont” because we hope to develop it into
a truly symbiotic partner of the student, displays
information to the student via the typewriter or the display
screen. The display screen, which is a 10-inch square
area on the face of a cathode-ray tube, represents
alphanumeric symbols and graphs. Whenever part of a
displayed pattern is touched by the tip of the light pen,
the computer can tell what part was touched and when.
The combination of computer-controlled cathode-ray
display and computer-signaling light pen is a convenient
and flexible arrangement for man-computer
communication.


¹ The work reported here was supported by the Council on Library
Resources at Bolt Beranek and Newman, Inc.

² Bolt Beranek and Newman, Inc.

³ Assistant Professor of Electrical Engineering at M.I.T. and
Consultant to Bolt Beranek and Newman, Inc.

⁴ At present, at Stanford Research Institute; formerly Consultant to
Bolt Beranek and Newman, Inc.

⁵ At present, at the IBM Research Center; formerly at Bolt Beranek
and Newman, Inc.

⁶ Programmed Data Processor 1 (PDP-1), manufactured by the
Digital Equipment Corporation, Maynard, Massachusetts.

⁷ IBM Executive Electric Typewriter (Model B), modified for use
with digital computer by the Soroban Engineering, Inc., Melbourne,
Florida.

⁸ The display screen and light pen are parts of the PDP-1 system,
supplied by the Digital Equipment Corporation.

Symbiont is an early stage of what we hope will be a 
continuing evolution. However, a sufficient set of func- 
tions has been implemented to lead us to take stock and 
gain experience in their use before modifying existing 
functions or adding new ones. 

Inasmuch as Symbiont is an exploratory tool, for use 
mainly by students who are at the same time experi- 
menters, we have not considered it necessary to perfect 
or polish. For example, the display flickers. With the aid 
of character-generation and display-buffering equipment, 
we could achieve a steady display: the technology is far 
enough advanced to fulfill the display function well. At 
present, however, the equipment required for flicker-free 
display is expensive, and we prefer to put available 
funds into other things. Our reasoning is that, in due 
course, good, steady displays will become relatively inex- 
pensive, and in the interim we can make allowances for 
a bit of flicker. The same argument applies to text 
storage capacity, text searching rate, and production of 
permanent copy. In short, our aim has been to realize 
several interesting functions now, even though in ways 
for which certain allowances have to be made, in order 
to gain early experience in using the functions and to 
provide a basis for practical system design when advances 
in technology make it possible to implement the func- 
tions effectively and economically. 





• Operations and Functions Implemented 


A study session with Symbiont starts with the com- 
puter turned on, the basic program running, and the 
text and graphs of several technical documents already 
punched into machine-readable paper tape. The text is 
represented character-by-character in a standard alpha- 
numeric code. The curves of the graphs are represented 
numerically by coordinate values at selected points along 
the abscissae, and the calibrations, labels, and legend are 
represented alphanumerically in a prescribed format. 

At the beginning of his study session, the student loads 
representations of the documents he plans to study from 
an input tape into the computer memory. Then, typi- 
cally, he calls for a document and reads or scans it. He 
calls for it by typing any part of its standard bibli- 
ographic citation that specifies it uniquely—the author's 
name, for example, or a major part of the title, or the 
name (and perhaps volume or year) of the journal in 
which the document was published. Symbiont finds the 
specified document and presents the first screen-page of 
it. (A screen-page is about 150 words in length. Lines 
and pages have to be shorter on presently available dis- 
play screens than full lines and pages are in most docu- 
ment-pages.) The student turns pages in the forward 
direction by hitting the space bar of the typewriter. He 
may back up a page at a time by hitting the backspace 
key. 
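The lookup step described above, calling for a document by any part of its citation that specifies it uniquely, can be sketched as follows. This is a minimal modern illustration in Python, not the PDP-1 program; the function name `find_document` and the sample citation strings are hypothetical.

```python
def find_document(citations, fragment):
    """Return the index of the single citation containing `fragment`.

    Returns None when no citation matches or when more than one matches,
    i.e., when the fragment does not specify a document uniquely.
    """
    needle = fragment.lower()
    hits = [i for i, c in enumerate(citations) if needle in c.lower()]
    return hits[0] if len(hits) == 1 else None


# Hypothetical citation strings, for illustration only.
citations = [
    "Smith, A., Hearing in Noise, J. Acoust., 1961",
    "Jones, B., Vision in Noise, J. Optics, 1962",
]
```

A fragment such as "Noise" appears in both sample citations and therefore selects neither; an author's name or a journal name that occurs only once selects its document.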

While reading or scanning, the student comes upon a 
passage that he wants to record verbatim for future 
reference—a passage he would ordinarily copy onto a 
note card. With the aid of Symbiont, he records it on 
paper tape or in the note-file part of the computer 
memory. To punch it on paper tape, he touches the 
initial printed character or characters of the passage 
with the light pen and then types “b” (for “begin”). 
Underlining thereupon appears beneath the character(s) 
touched. Then he touches the final printed character(s) 
of the passage and types “e” (for “end”). Underlining 
thereupon appears beneath the ending of the passage, 
and immediately spreads back to the beginning. The 
passage is thus singled out for inspection by the student 
and for action by the computer. When the student types 
“p” (for “punch”), Symbiont punches the passage into 
paper tape. If the student next underlines the bibli- 
ographic-citation string that appears at the head of the 
document, Symbiont appends the citation to the note, 
thus handling a chore that ordinarily plagues the con- 
scientious notetaker when he takes his notes and the 
unconscientious notetaker when he tries to use his notes. 
The student can string any number of passages together 
by underlining them and punching them one at a time, 
in groups, or all at once. 

If the student prefers to note the passage in the com- 
puter memory instead of paper tape, he needs to specify 
a “tag” with which to retrieve it. He specifies the tag 
(before underlining the passage) by typing “t” (for 
“tag”) and then any symbol, or indeed any string of 
printing characters and spaces, terminated by a carriage 
return. He then underlines the passage and types “n” 
(for “note”). Alternatively, he can assign to the passage 
a “label,” which is functionally equivalent to a tag, but 
specified initially by underlining a string of characters 
on the screen with the light pen and then typing “l” (for 
“label”). The procedure for connecting the label to its 
passage is the same as the procedure for connecting a 
tag. Tags and labels go into a “glossary” of retrieval 
terms associated with the note file. To see what the 
glossary holds at any time, the student types “g” and 
looks at the screen. If the glossary is more than one 
page long, he turns its pages as though it were text. 
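The relation among notes, tags or labels, and the glossary can be pictured with a small data structure. The Python sketch below is illustrative only; the class name `NoteFile`, its method names, and the choice to allow several passages under one retrieval term are assumptions, not details given in the text.

```python
class NoteFile:
    """Notes stored under tags or labels; the glossary lists the retrieval terms."""

    def __init__(self):
        self._notes = {}  # retrieval term -> list of noted passages

    def note(self, term, passage, citation=None):
        # Append the citation, if given, so the source travels with the note.
        text = passage if citation is None else f"{passage} [{citation}]"
        self._notes.setdefault(term, []).append(text)

    def glossary(self):
        # The retrieval terms currently in use, in alphabetical order.
        return sorted(self._notes)

    def retrieve(self, term):
        # All passages noted under the given tag or label.
        return list(self._notes.get(term, []))
```

Because a tag and a label are functionally equivalent once attached, the sketch stores both in the same mapping; they differ only in how the student specifies them.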

Often the student wants to retrieve notes, and some- 
times he wants to amend or combine them. To retrieve 
a note, the student types “r” (for “retrieve”) and then 
types the tag or label (or if more convenient, designates 
a corresponding string of characters by underlining them 
with the light pen). In amending and combining re- 
trieved notes, the student is constrained by the present 
system to serial designation and concatenation of passages 
and subpassages. Under these constraints, editing is like 
operating a switch engine. However, it will be easy to 
introduce the operations of deletion and insertion. 

Verbatim notetaking and retrieval of notes are ad- 
mittedly minor matters. More vital is retrieval of pri- 
mary information. In the present context, since the stu- 
dent is assumed to be working with a small collection of 
documents known to be relevant to the topic under inves- 
tigation, the retrieval problem is not primarily one of 
finding documents. It is primarily one of finding pas- 
sages in documents that discuss particular ideas, pas- 
sages that are relevant to particular technical points. 
The approach of Symbiont to this problem is to auto- 
mate the scanning of text for specified configurations of 
retrieval terms. 

Symbiont carries out searches with reference to one, 
two, or three sets of retrieval terms. Each set may con- 
tain any number of terms of any length. For retrieval 
purposes, all the members of a set are assumed to be 
synonymous: Symbiont considers that it has found the 
set as soon as it finds any member of a set. Symbiont 
looks for members of the three sets within a “neighbor- 
hood” of text. A neighborhood is n lines in length, and 
the student can set n to any value he likes. Five lines 
make a good neighborhood. 

Before conducting a search, the student types “t” (for 
“terms”), then types the strings of characters that con- 
stitute the alternate terms of the first retrieval set, and 
types “1” to designate this set as the first. Then the 
student types “t,” the terms of the second set, and “2,” 
and finally “t,” the terms of the third set, and “3.” The 
three sets might be, for example: 


    1             2             3 
    cigarette     lung          cancer 
    cigarettes    lungs         carcinoma 
    cigar         pulmonary 
    cigars 
    pipe 
    pipes 
    tobacco 
    tobaccos 
    nicotine 


The student then decides whether he wants a passage 
(neighborhood) dealing with one of the three, two of the 
three, or all three ideas (sets), and he initiates the search 
by typing “f1,” “f2,” or “f3” (for “find one,” etc.). 
Symbiont thereupon searches serially through the text 
until it either comes to the end or finds a neighborhood 
that meets the specifications. If it comes to the end, it 
displays “not found.” If it finds a neighborhood that 
meets the specifications, it displays on the screen the 
text containing the neighborhood, showing a small amount 
of preceding text and a larger amount of succeeding text. 
The student may turn pages, copy passages, etc., in the 
way described earlier, or he may type “f1,” “f2,” or “f3” 
and have Symbiont look for another passage that also 
meets the specification. 
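The search just described can be sketched in a few lines of Python. This is a modern illustration under stated assumptions, not the original program: the name `find_neighborhood` is hypothetical, matching is done by simple case-insensitive substring test, and “f2” is read as “at least two of the sets,” which the text does not settle.

```python
def find_neighborhood(lines, term_sets, required, n=5, start=0):
    """Return the index of the first line at or after `start` that begins an
    n-line neighborhood in which at least `required` of the term sets occur,
    or None if the text ends first ("not found").

    A set counts as found when any one of its terms appears in the
    neighborhood, since members of a set are treated as synonyms.
    """
    for i in range(start, len(lines)):
        window = " ".join(lines[i:i + n]).lower()
        found = sum(
            any(term.lower() in window for term in terms)
            for terms in term_sets
        )
        if found >= required:
            return i
    return None


# Small made-up text and term sets, for illustration only.
text = [
    "Smoking was studied.",
    "Cigarette use rose.",
    "No effect seen.",
    "Lung tissue showed",
    "signs of carcinoma.",
]
sets = [["cigarette", "tobacco"], ["lung", "pulmonary"], ["cancer", "carcinoma"]]
```

Passing the index after a previous hit as `start` corresponds to typing “f1,” “f2,” or “f3” again to look for another passage. Shrinking `n` or raising `required` tightens the retrieval prescription.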

Although the idea-retrieval technique just described is 
primitive, it is surprisingly effective if the student is 
clever in setting up the sets of terms. Typically, the 
student starts with a loose retrieval prescription and 
tightens it as he makes his way through his collection of 
documents. 

Graphs are composed by the computer from tabulated 
data, and presented on the screen as graphs. They are 
displayed separately from text. They have keys that 
associate labels with curves; they have calibrated and 
labeled axes; and they have legends. Curves are approxi- 
mated by straight-line segments, dashed and/or dotted 
in eight patterns. A family of curves can have any num- 
ber of members, but in the present system, only one 
label. Up to eight families of curves can be superimposed 
upon one grid. Two grids can be set side-by-side to 
facilitate comparison. If the graphs are fundamentally 
comparable but different in scale factor, the student can, 
with the aid of the light pen, expand or compress the 
scales of one or the other until the two presentations are 
directly comparable. He adjusts the length or position 
of a line segment of the coordinate frame by touching 
one of its ends with the light pen (which “picks up” the 
end-point) and then moving the end-point to the desired 
location. If necessary, he repeats the procedure with the 
other end-point. The computer then rescales and relo- 
cates the entire graph. If two graphs are displayed side- 
by-side, one of them can be moved and superimposed 
upon the other, or curves can be transferred from one 
to another. These operations facilitate synthesis of a 
composite picture from results obtained by diverse 
investigators. 


188 American Documentation — October 1966 

Symbiont makes it easy to modify not only the size 
of a graph but also the grid structure, the structure of 
the subdivision of the area within the graph. When it 
changes a grid, it also changes the numbers associated 
with the grid lines (i.e., the numbers associated with the 
scale-calibration points). 

At the bottom of the screen, there is a display of 
numerals and control symbols. By pointing with the 
light pen to individual numerals in proper sequence, the 
student can build up any number he needs. Then, desig- 
nating with the light pen the control symbol “SCALE” 
and a scale-calibration point, he can substitute the 
assembled number for the number theretofore associated 
with the scale-calibration point. As soon as new num- 
bers have been associated with two calibration points 
on a linear axis scale, the computer substitutes new 
values at all the other calibration points on the axis. 
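The recalibration of a linear axis from two newly numbered points is ordinary linear interpolation and extrapolation. As a sketch under assumptions (the function name `recalibrate` is hypothetical, and positions are taken to be coordinates along the axis in any consistent unit), the new value at every other calibration point follows from the two given ones:

```python
def recalibrate(positions, p1, v1, p2, v2):
    """Given new values v1 and v2 at calibration positions p1 and p2 on a
    linear axis, return the new value at every position in `positions`."""
    if p1 == p2:
        raise ValueError("the two calibration points must differ")
    slope = (v2 - v1) / (p2 - p1)  # change in value per unit of position
    return [v1 + (p - p1) * slope for p in positions]
```

For example, assigning 10 to the point at position 1 and 30 to the point at position 3 fixes the values at positions 0, 2, and 4 as well.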

If he wants to change the number of grid lines that 
subdivide (say) the “pressure” scale of a graph, the stu- 
dent points with the light pen to the control symbol 
“GRID” and then to the label “PRESSURE” and then 
to the appropriate numeral corresponding to the desired 
number of grid lines. The computer immediately redraws 
the grid, leaving the extreme grid lines unchanged, and 
substitutes the appropriate new numbers near the inter- 
sections of the new grid lines and the horizontal axis. 
With these procedures, the student may experiment 
rapidly with various frames and grids, for he need 
specify only the essential parameters of each coordinate 
system. As soon as they are specified, Symbiont develops 
the detailed pattern. 
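Redrawing a grid from its essential parameters can be sketched as below. The function name `grid_values` is hypothetical, and even spacing between the extreme grid lines is an assumption; the text says only that the extreme lines stay where they were.

```python
def grid_values(low, high, count):
    """Return `count` evenly spaced grid-line values from low to high,
    keeping the two extreme grid lines where they were."""
    if count < 2:
        raise ValueError("need at least the two extreme grid lines")
    step = (high - low) / (count - 1)
    return [low + k * step for k in range(count)]
```

The student supplies only the number of lines; the labels at the intermediate lines follow from the two extremes.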


• Evaluations and Plans for Improvement 


Our experience in using Symbiont has been limited by 
shortage of input tapes and by smallness of the computer 
memory. A semi-automatic tape-preparation subsystem 
and an arrangement for moving information automati- 
cally between primary (core) and secondary (drum) 
memory are the items of highest priority in the plans for 
Symbiont II. Even on the basis of the limited experience, 
however, it seems clear to us that the functions provided 
by Symbiont I (the system thus far implemented) are 
effective as aids in technical study. The function of 
searching for ideas, as primitive as the implementation 
is in Symbiont I, is little short of powerful. The automa- 
tion of verbatim notetaking, despite shortcomings in 
human engineering, seems capable of serving as the 
foundation for efficient personal documentation systems. 

In Symbiont I, however, too many of the graph- 
handling functions deal with frames, grids, and labels, 
and not enough deal with curves. The limitation to linear 
transformations is highly constraining. We must admit, 
therefore, that the graph-handling functions of Symbiont 
I do little more than (a) afford convenience in the few 
parts of the over-all process of graph manipulation that 
they subsume and (b) make it seem plausible that a 
fuller set of functions (involving perhaps 10 times as 
much programming) would be truly useful. 

The plans for Symbiont II call for the following modi- 
fications of, and additions to, Symbiont I: 


1. A subsystem to “semi-automate” preparation of 
input tapes of textual and graphical information. Be- 
cause performance of the system during study does not 
depend upon how the tapes were prepared, we deferred 
work on a tape-preparation subsystem and relied upon 
manual production of input tapes. Manual production 
proved not to be satisfactory. For Symbiont II, we plan 
to take text mainly from monotype and linotype tapes 
and to use computer film-reading techniques in convert- 
ing graphical data to tabular form. 

2. Extension of the storage areas, confined to core 
memory and supplementary paper tape in Symbiont I, 
to the magnetic drum (22 times 4,096 18-bit words) now 
associated with the PDP-1, and perhaps also from the 
drum to magnetic tape units. 

3. Substitution of light-pen for typewriter control of 
most operations that deal with information displayed on 
the screen. 

4. A descriptor-and-thesaurus system for retrieving 
documents from store. Symbiont I retrieves documents 
with the same searching system it uses in finding passages. 
(A bibliographic designation precedes each document in 
the store of text.) That will be too slow when the store 
becomes large. 

5. A scheme for turning several or many pages at a 
time or for going immediately to a particular page speci- 
fied by page number. 

6. More reliance upon predetermined sequences of 
manipulation and less upon control characters. For exam- 
ple, to underline a segment of text, it should suffice to 
point with the light pen to an “underline” light button, 
then to the beginning of the passage, and then to the end 
of the passage. It is an unnecessary nuisance to have to 
specify “end” after having specified “begin.” However, 
streamlining the procedure in this way will make it neces- 
sary to provide a way of reminding the student when 
he forgets where he is, in a sequence of operations, and 
a way of letting him linger on (or return to) a par- 
ticular operation long enough to correct a mistake in 
specifying it. 

7. Handling of notes precisely as though they were 
documents. Notes will be permitted to contain graphs. 
The note-retrieval glossary will be associated with the 
document-retrieval system. 

8. Acceptance of notes phrased by student. This now 
seems essential even though it is easy for him to record 
verbatim notes. 

9. Provision for extraction from text of individual 
words, individual phrases (delimited by punctuation 
marks), individual sentences, and individual paragraphs 
merely by pointing. It is an unnecessary nuisance to 
underline (i.e., to point to both ends of) a segment unless 
one wants to extract a sequence of characters that does 
not constitute a formal unit. 

10. Labeling of individual curves as well as of families. 

11. Labeling near the curve as an alternative to asso- 
ciating label and curve by key. 

12. Search for more than three sets of terms, and for 
other combinations (such as 1 and 2 or 1 and 3) than 
any m of n. 

13. Storage and retrieval of the sets of terms used in 
searching text. It is not good to have to type a set of 
terms more than once, and it will be easy to store them 
for future reference. The student will be able to retrieve 
a set by typing any term in the set. Symbiont II will 
display all the sets that contain the typed term and let 
the student select the one he wants by pointing to it. 

14. In designating parts of graphs to the program for 
action, more pointing to the parts themselves, and less 
pointing to their names. 

15. Transformation between linear and logarithmic 
coordinates. 

16. Fitting of curves (specified by type, such as sine, 
exponential, and power series) to tabulated numerical 
data, and determination of goodness of fit. 

17. Weighted averaging of curves. 
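Of the planned items above, the retrieval of stored term sets by any member term (item 13) lends itself to a short illustration. The Python sketch below is an assumption about how such a lookup might behave, not a description of Symbiont II; the name `sets_containing` and the sample sets are hypothetical.

```python
def sets_containing(stored_sets, term):
    """Return every stored term set that contains `term` (case-insensitive)."""
    key = term.lower()
    return [s for s in stored_sets if key in (t.lower() for t in s)]


# Hypothetical stored sets, for illustration only.
stored = [
    ["cigarette", "tobacco", "pipe"],
    ["lung", "pulmonary"],
    ["pipe", "tube"],
]
```

When a typed term belongs to more than one stored set, all of them are returned, which matches the plan of displaying the candidates and letting the student point to the one he wants.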


The present plan is to effect the foregoing improve- 
ments, to gain further experience, and then, in proceeding 
to the third generation of study facilities, to meld them 
with arrangements, not described, to facilitate the organ- 
ization and retrieval of notes and data and the prepara- 
tion of technical papers. For further information about 
the context of the Symbiont system, see reference (1) 
below. 


Reference 


1. Licklider, J. C. R., Libraries of the Future, M.I.T. Press, 
Cambridge, Massachusetts, 1965. 


Biological Dictionary Preparation, 


Control, and Maintenance 


A description of the working processes involved in prepa- 
ration, control, and maintenance of a biological dic- 
tionary or thesaurus is given. The actual use of the 
dictionary by the abstractor-indexer is illustrated by 
sample coding sheets and examples from the dictionary 
of cross references, instructions, and scope notes. The 
use of specific authorities such as the World Health 


Control of indexing terms is absolutely essential when a 
group of subject-trained individuals is responsible for 
both storing and retrieving information. 

The language used by the originator of the article 
being coded may not and often does not match that used 
by the indexer or the searcher (1). Communication and 
language problems involve viewpoint or class context, 
generics, and semantics (1). 


Since our indexers work with one viewpoint in mind 
(the effects of drugs on biological systems), this factor 
is not so important as the other two, semantics and 
generics. 

The relationships between words and their meanings, 
such as synonyms, near synonyms, and homographs, make 
up the semantic problem (2). 

An information system must show how words are 
used (3). It must also make provision for cross refer- 
ence to enable the searcher to retrieve all pertinent infor- 
mation on concepts of interest to him (4). 

The need for and concept of a technical thesaurus has 
been demonstrated by others (5-8). Thesauri and/or 
authority lists have been successfully created by other 
organizations (9-16), but none of these were specific 
enough or broad enough for the needs of our group. 

Early compilations or authority lists were often called 
dictionaries rather than thesauri. The Abbott Abstracts 
biological authority list still bears the label of dictionary 
although it is fulfilling the function of a thesaurus by 
the inclusion of scope notes and cross references as 
needed. 
Organization Classification of Diseases is cited. Em- 
phasis is placed on firm control by one person trained 
in biological science, especially in the matter of syno- 
nyms and of family or generic entries. Without this 
control, a great many pertinent references can easily 
be lost when a search is made. Non-thesaurus, uncon- 
trolled indexing is very wasteful and unreliable. 


S. JANE WEINSTEIN 


Abbott Laboratories 
North Chicago, Illinois 


Our methods of preparing a dictionary-thesaurus have 
proven to be reliable, uniform, accurate, and extremely 
efficient (which means that they are excellent time savers 
and as such save money). 

The Abbott Abstracts Biological Dictionary was organ- 
ized in 1960 to control the indexing terms used by the 
group of subject specialists who abstract and index the 
current published literature for the working scientists at 
Abbott Laboratories (18). Our group of subject spe- 
cialists, who number seven at the present time, include 
two Ph.D.’s, one a biochemist, the other an organic 
chemist. The other members of the group all have 
master’s degrees or the equivalent in one of the sciences. 
They include a physiologist, a pharmacologist, a zoologist, 
and two other organic chemists. 

The Biological Dictionary lists in alphabetical order 
terms, cross references, and all related or generic terms 
required when a specific concept is used, and provides 
scope notes whenever necessary to clarify the intended 
meaning of a term. 

An illustration of a term can be seen in Fig. 1. The 
term “Muscle Contraction, Cardiac/Normal-Physiologi- 
cal” requires the terms “Heart” and “CVS” (cardio- 
vascular system) to be indexed as well. The generic 
relationship is thus established and maintained. Optional 
terms that may be used by the indexer if they are dis- 
cussed in the article are appended as scope notes. The 
symbol "II" after each of these terms means “if indi- 
cated” to stress that the listed terms are optional and 











027053168276  *MUSCLE CONTRACTION, CARDIAC /NORMAL-PHYSIOLOGICAL/ /4/ M74107 
027053168276B HEART /5/, CVS /5/, VENTRICLE II /5/, ATRIUM II /5/, M74107 
027053168276C TISSUE PREPN. ISOLATED II /5/, IN VITRO II /4/, IN SITU M74107 
027053168276D II /4/, HEART + PULSE RATE II /3/ M74107 
027053168276E FOR ABNORMAL CONTRACTION SEE HEART DIS., INVOLV. COR. M74107 
027053168276F ART. II /420.1/ /6/, HEART RHYTHM DIS., EXPER. II /6/ M74107 


Fig. 1. Dictionary term, generic entry, “Muscle Contraction, Cardiac /Normal-Physiological/” 


not obligatory. A cross reference is also made to “Ab- 
normal Contraction.” 
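The rule that a specific term carries its generic terms with it can be sketched as a simple expansion step. The Python below is an illustration only; the mapping `REQUIRED` holds just the one example given in the text, and the function name `expand_terms` is hypothetical. Optional “if indicated” terms are deliberately left out, since the indexer adds them only when the article discusses them.

```python
# Terms the dictionary requires whenever the key term is indexed
# (only the example from Fig. 1 is shown here).
REQUIRED = {
    "Muscle Contraction, Cardiac /Normal-Physiological/": ["Heart", "CVS"],
}


def expand_terms(chosen, required=REQUIRED):
    """Return the chosen terms plus any terms the dictionary requires with
    them, preserving order and dropping duplicates."""
    out = []
    for term in chosen:
        for t in [term, *required.get(term, [])]:
            if t not in out:
                out.append(t)
    return out
```

Applying the expansion at indexing time is what lets a later search on “Heart” or “CVS” retrieve an article that was indexed under the more specific contraction term.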

Synonyms are carefully controlled. The scope of this 
dictionary-thesaurus is broad in that it includes all con- 
cepts needed for indexing or coding articles written about 
the effects of drugs on biological systems, with the excep- 
tion of the drugs themselves and the generic chemical 
classes related to those drugs. 

The classified concepts (the most frequently used terms 
of which are printed on coding sheets for the indexer’s 
use) are divided into 10 categories. (Fig. 2.) 

The 10 categories include two groups of terms that are 
controlled in a chemical dictionary by an organic chemist. 
These are category 1, Drugs or Compounds, which are 
written in by the indexer as he locates them in the article 
being scanned, and category 2, the chemical classes to 
which the drugs or compounds belong. The remaining 
eight categories make up the Biological Dictionary. All 
except category 1 are divided into two sections, A and B. 
Terms in section A are printed on the coding sheet be- 
cause they are frequently used, and are simply circled by 
the indexer. Less frequently used terms must be chosen 
from the dictionary and written in by the indexer in the 
B section of the coding sheet. Category 1 has no A or B 
because there are no frequently used Drugs or Com- 
pounds. “Drug Actions” are noted in category 3, and 
general concepts and modifiers which are needed for the 
clarification of these concepts are included in category 4, 
which is a catch-all section. Category 4 includes Body 
Fluids, State and Sex of the living organism being dis- 
cussed, Pharmacological, Environmental, Nutritional and 
Toxicity Effects and Processes, Physical, Chemical, and 
other necessary terms used in indexing papers on pharma- 
ceutical technology. Both methods and apparatus used 
are of interest to the pharmacists who do research in 
formulation of drugs. The vocabulary to handle the 
articles on pharmaceutical technology is very specialized. 


1. Drugs or Compounds 
2. Chemical Classes 
3. Drug Actions 
4. General Concepts and Modifiers 
5. Anatomy, System and Organs 


The remaining categories are Anatomy, Systems and 
Organs (category 5); Systems, Diseases or Disorders, 
Symptoms (category 6); Tests and Test Records (cate- 
gory 7); Routes and Types of Administration (category 
8); Micro- and Macro-Organisms (category 9); and 
Fields and Types of Studies (category 10). 

The embryotic dictionary was largely medical in con- 
tent before the 1960 organization began. The World 
Health Organization (WHO) Classification of Diseases, 
1955 Revision, was used as an authority list with certain 
modifications in the disease terminology. However, and 
most important, its classification number was retained. 
When necessary, disease concepts, recognized after the 
1955 Revision, were added to the dictionary within the 
proper classification group. From the first, the dictionary 
also included the microorganisms that caused the dis- 
eases. Bergey’s Manual of Determinative Bacteriology 
was used as an authority list for the bacteria. Other 
disease-inducing organisms such as viruses, fungi, and 
protozoa were also included if an infectious disease caused 
by them was listed. Nonpathogenic microorganisms were 
added only as they appeared in articles being abstracted. 

We decided arbitrarily that the necessary coding terms 
to index a disease would be the WHO name, the system 
affected, and the specific organism if known. 

Figure 3 presents a tabulation listing System in one 
column, followed by “Dis.” (Disease or Disorder) in 
another column, and “Symp.” (Symptom) in the last 
column. The disease is written in the blank space below 
the tabular listing exactly as it appears in the Biological 
Dictionary. Cerebral Embolism would be correctly in- 
dexed as “Embolism and Thrombosis, Cerebral.” In the 
tabular area a circle would be placed around “NS & SO” 
(Nervous System and Sense Organs) and around “CVS.” 
If the disease were of organic nature, check marks would 
be placed opposite “NS & SO” and “CVS” in the column 
headed “Dis.” On the other hand, were the disease 


6. System Disorder & Symptom 
7. Tests & Test Records 
8. Routes & Types of Administration 
9. Animals Incl. Microorganisms 
10. Fields & Types of Study 


Fig. 2. Coding sheet categories 
(6A) SYSTEM, DIS. & SYMPT. 

System              Dis.    Symp. 
Allergic 
B & BFO 
Congenital mal. 
DPC&P 
DS 
Early infancy 
ES 
Experimental 
General 
GUS 
Mental P & P 
MSS 
Neoplastic 
Nutr. & met. 
RS 
S & CT 

(6B) WHO DISEASE 

Embolism and Thrombosis, Cerebral 

Fig. 3. Disease recording in area 6 of the coding sheet 


attributable to drug administration, check marks would 
be made in the column headed “Symp.” opposite “NS & 
SO” and “CVS.” By making this distinction, it is possible 
to use the same code for both disease and adverse reac- 
tions to drugs. 

We carry this distinction throughout the entire Bio- 
logical Dictionary. All the diseases listed have coding 
instructions for indicating the cause (normal or drug 
induced) of illness as discussed in the article being in- 
dexed. In infectious disease, for example, “Tuberculosis,” 
the disease name is written in the space below the tabular 
listing shown in Fig. 3 and IP (Infectious and Parasitic) 
is circled. 

The genus and class (taxonomic) of the causative or- 
ganism are indexed in area 9 as shown in Fig. 4. “Myco- 
bacterium” is written in area 9B and “Bacteria” is circled 
in 9A. 

In addition to its largely medical content, a few basic 
modifying terms were included in the original compila- 
tion of the dictionary. The dictionary is now in its fourth 
edition and has been greatly expanded with about 3,500 
terms now included. This required a total of about 7,000 
IBM cards to prepare and print on an IBM 1403 printer. 

Standardized indexing or coding instructions, to be 
discussed shortly, are provided for mechanized retrieval 
(9A) ANIMALS INCL. MICROORG. 

Aves            Guinea Pig 
                Human 
Bovine          Mouse 
Candida         Rabbit 
Cat             Rat 
Chicken         Staphylococcus 
Dog             Streptococcus 
Escherichia     Viruses 
Fungi 

(9B) 


Fig. 4. Area 9 of the coding sheet 


of specific biological concepts, including pharmacological 
activities. 

Figure 5 is a partial page from the Biological Dic- 
tionary which shows the format we use. From left to 
right the format shows the random 12-digit number used 
for mechanized retrieval, the dictionary term itself, a 
number in slashes, / /, to identify the particular section 
of the coding sheet to be used, and the sequential number 
of the term with the proper letter prefix. 
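The four fields of a head line in this format can be separated mechanically. The Python sketch below is a modern illustration, not part of the Abbott system; the pattern `ENTRY` and the function `parse_entry` are hypothetical, and the sketch assumes a head line consists of exactly twelve digits, an optional asterisk, the term, one area number in slashes, and a letter-prefixed sequence number. Continuation lines, which carry a letter suffix after the 12-digit number, are not parsed.

```python
import re

# 12-digit number, optional "*", term, /area/, letter-prefixed sequence number.
ENTRY = re.compile(
    r"^(?P<number>\d{12})\s+\*?(?P<term>.+?)\s+/(?P<area>\d+)/\s+(?P<seq>[A-Z]\d+)$"
)


def parse_entry(line):
    """Split a dictionary head line into (number, term, area, sequence),
    or return None if the line does not have the assumed layout."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    return m.group("number"), m.group("term"), int(m.group("area")), m.group("seq")
```

Because the term itself may contain slashes, as in “/NORMAL-PHYSIOLOGICAL/,” the pattern takes the last all-numeric slashed field before the sequence number as the coding-sheet area.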

Originally, organizing the terms for inclusion in the 
dictionary required about four to six months of the 
author’s time with the full-time aid of a clerical worker. 
At the onset, two identical 3 X 5 cards were typed for 
each term; one set of cards was alphabetized, the remain- 
ing set was arranged sequentially by class, according to 
WHO classification numbers. Pathogenic microorganisms 
were arranged alphabetically within each family or genus, 
while general modifying terms were placed in whatever 
related areas were possible; for example, all toxicity 
modifiers were grouped together. The other categories 
were similarly grouped. 


The product of the first set of 3 X 5 cards was the 
alphabetic listing of terms; these were subsequently key- 
punched and printed to create the dictionary. 

The product of the other set of 3 x 5 cards is to be a 
classified dictionary, now in the process of being key- 
punched. One section, the classified diseases according 
to WHO, is finished and is already in use. 


316345385398  *ADMINISTRATION /4/ A227039 
316345385398B /USED ONLY WHERE RATE OR MANNER OF ADMINISTRATION IS A22762 
316345385398C CRITICALLY STUDIED OR DISCUSSED./ E.G., LONG TERM *** A22763 


Fig. 5. Dictionary term, “Administration” 

During the production of the 3 x 5 decks a great deal 
of time was spent in editing; making cross references to 
and from synonyms; and, when necessary, appending 
scope notes to explain or to limit the use of terms. 

An example of this, shown in Fig. 5, is the term 
“Administration” which refers to the administration of 
a drug. The three asterisks indicate that the modifying 
terms should be written in parentheses after the term is 
circled on the coding sheet to bring to the attention of 
the reader the specific reason why “Administration” was 
chosen as an indexing term. 

Another example is seen in the scope note following 
the term in Fig. 6: “Chronotropic /affecting the time or 
rate, especially the rate of contraction; said of nerve 
fibers that affect the rate of cardiac contraction, the 
vagus slowing, the sympathetic accelerating/ see Heart 
& Pulse Rate.” In this case the scope note explains the 
meaning of the term and then refers to the correct term 
to use in coding the concept under discussion. In many 
instances, corporate scientists, whose specialty was related 
to the specific area in which coding terms were being 
developed, were consulted for clarification (or amplifica- 
tion). For example, when vague or ambiguous terms 
related to cardiovascular-system effects were encoun- 
tered, a pharmacologist working on cardiovascular drugs 
was consulted. The subject specialist decided that the 
term “Chronotropic,” even though used in the literature, 
was somewhat ambiguous and often loosely used. It was 
on this basis that “Chronotropic” was explained by defi- 
nition and included in the general concept of “Heart and 
Pulse Rate.” A scope note is appended to this latter 
term also. 

Throughout the entire early period of organizing the 
dictionary or thesaurus, a constant effort was made to 
use the best reference sources available as well as to 
consult with subject specialists in every instance where 
a term might be ambiguous in meaning. The choice of 
the best synonym was also done in this fashion when 
necessary.

To keep the dictionary manageable in size, many terms
were entered generically rather than specifically. For
instance, it was decided not to name specific parts of the
intestine, such as ileum or duodenum, even if they were
specifically named in the article being indexed, but to
code only “Intestine /large and small/.” “See” references
were made from the unused specific terms to the
generic term.

This dictionary is open-ended. New concepts can be 
added and old ones modified at any time. Because it is 
printed from a deck of key-punched cards, old cards can 
be pulled and changed or new ones can be key-punched 
and inserted whenever or wherever necessary. 

The 3x5 cards shown in Figs. 7 and 8 depict master 
dictionary cards. Figure 7 represents a new term, and 
Fig. 8 a revised term. The new term card has the entry 
“Depilatory /3/ /%/” typed on it. On the left is the 
5-digit accession number (A# 35318) of the abstract 
where the term first appeared. The number in slashes 
next to the term indicates that the term is to be entered 
in area 3 of the coding sheet. The percent sign informs 
the indexer that this term does not yet have a random 
number and is to be circled in red when written on the 
coding sheet. 

At the end of each week, completed code sheets are 
scanned by a clerk for red-encircled terms. The number
of the abstract is entered in the master 3 X 5 card dictionary.
When 10 such numbers are entered, the term is
assigned a random number which is the first step in 
making it machine retrievable. The term is then entered 
on the 10 master abstract cards in which it had been 
used and the random number is key-punched in the 
master search card which represents each abstract. The 
term is also retyped on a 3 X 5 card with its new random 
number and an asterisk is placed before the term to 
indicate that it has a random number and is no longer 
to be encircled in red when written on the coding sheet. 
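
The weekly promotion rule described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the clerical procedure itself: the class name, the function names, and the layout of the random number (four distinct three-digit groups in ascending order, modeled loosely on the numbers printed in the figures) are assumptions made for the example.

```python
import random

# From the text: a term is given a random number after ten recorded uses.
PROMOTION_THRESHOLD = 10


def make_random_number(rng):
    """Hypothetical format: four distinct 3-digit groups, ascending."""
    groups = sorted(rng.sample(range(1000), 4))
    return "-".join(f"{g:03d}" for g in groups)


class TermCard:
    """One master 3 x 5 card for a dictionary term (illustrative model)."""

    def __init__(self, term, area):
        self.term = term
        self.area = area
        self.abstract_numbers = []   # accession numbers where the term was used
        self.random_number = None    # assigned only after promotion

    def label(self):
        # "/%/" marks a term without a random number; "*" marks one with it.
        if self.random_number is None:
            return f"{self.term} /{self.area}/ /%/"
        return f"*{self.term} /{self.area}/"

    def record_use(self, abstract_number, rng):
        """Log one red-encircled use; return True if the term was promoted."""
        if self.random_number is not None:
            return False             # already machine retrievable
        self.abstract_numbers.append(abstract_number)
        if len(self.abstract_numbers) >= PROMOTION_THRESHOLD:
            self.random_number = make_random_number(rng)
            return True
        return False
```

A card built as `TermCard("DEPILATORY", 3)` prints with the percent sign until `record_use` has been called ten times; after that, `label()` returns the asterisk form.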

After the term card has been checked for accuracy, a
clerk makes a stencil on a Chiang Small Duplicator (17).
Each holder of a Biological Dictionary receives a card. 
If the new term is a disease, a card is also made for each 
copy of the Classified Diseases Dictionary. The cards 
for the Biological Dictionary are retained in alphabetical 
order; those for the Classified Diseases Dictionary are 


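CHRONOTROPIC /AFFECTING THE TIME OR RATE, ESPECIALLY THE
RATE OF CONTRACTION; SAID OF NERVE FIBERS THAT AFFECT
THE RATE OF CARDIAC CONTRACTION, THE VAGUS SLOWING, THE
SYMPATHETIC ACCELERATING/ /3/
SEE HEART & PULSE RATE


Fig. 6. Dictionary term, “Chronotropic”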
DEPILATORY /3/ /%/

A# 35318


Fig. 7. Master 3 x 5 dictionary card, new term, “Depilatory /3/ /%/”






*LIPOLYTIC ACTIVITY /4/

102-173-229-298


*LIPOLYTIC /3/

102-173-229-298          15315


Fig. 8. Master 3 x 5 dictionary cards, revised term, “Lipolytic /3/”
filed by class number. When indexing, the abstracter-indexer
checks both the printed version and the collection
of cards representing new terms for the desired concept.

Figures 9 and 10 are flow charts which describe the
process of entering a new term (Fig. 9) or making a
change in an old term (Fig. 10) for updating the
dictionary.

As Fig. 9 depicts, a number of clerical operations as
well as various editing steps are required to enter a new
term. One step, the assignment of a sequential number,
is for the purposes of filing and relocating if cards should 
be accidentally moved from their normal position. A 
sequential number is assigned by checking a 5-place log 
table for the numbers between the two sequential num- 
bers already assigned to the neighbors of the new card. 
The colored flag instructs the key-punch operator in 
regard to punching positions. 
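
The idea of giving a new card a number that falls between the numbers of its two neighbors can be sketched as follows. The sketch takes the integer midpoint instead of consulting a log table, and the sentinel bounds 0 and 100000 for the ends of the deck are assumptions made for the example; the card numbers used with it would likewise be made up.

```python
import bisect


def sequence_number_between(lower, upper):
    """Pick an unused number strictly between two neighboring numbers."""
    if upper - lower < 2:
        raise ValueError("no unused number between the neighbors")
    return lower + (upper - lower) // 2


def insert_card(deck, term):
    """Insert a term into a deck kept in alphabetical order.

    deck is a list of (term, sequence_number) pairs sorted by term.
    Returns the sequence number assigned to the new card.
    """
    terms = [t for t, _ in deck]
    pos = bisect.bisect_left(terms, term)
    lower = deck[pos - 1][1] if pos > 0 else 0            # assumed floor
    upper = deck[pos][1] if pos < len(deck) else 100000   # assumed ceiling
    number = sequence_number_between(lower, upper)
    deck.insert(pos, (term, number))
    return number
```

Because every card carries a number consistent with its alphabetical position, a card that has been moved by accident can be refiled by number alone.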

In the case of a change in terms, for example the
assignment of a random number after a term has been
used 10 times or a change in meaning, the clerical process
is somewhat different. Two 3 X 5 cards are typed with
the corrected version of the term. Figure 8 depicts the
old and the new master card for the term “Lipolytic.”
The name was changed from “Lipolytic Activity” to
“Lipolytic,” and the coding area was moved from area 4
to area 3. A new number appears on this term card in
the upper right hand corner. It is the serial number of 
the random number. This number will shortly be in use 
since we are in the process of converting our file from 
cards to magnetic tape. The random number will be 
replaced by the serial number when the conversion is 
complete. As shown in Fig. 10, the clerical process is 
somewhat more complicated and involves more steps for 
revising terms than for entering a new term. Old punched 
cards must be clipped to the revised 3 X 5 card for 
Classified Dictionary key-punching as the instructions for 
this process are rather complicated. It is easier to give 
the operator the old cards to use as a pattern than to 
write new instructions for each change. A change list 
must also be made and circulated to each holder of the 
dictionaries. By making the actual correction in his own 
copy the Information Scientist is alerted to the change 
in term and will be aware of it in future coding work. 
In the Alphabetical Biological Dictionary, terms are 
in strict alphabetical order even though the term may 
involve more than one word. Experimental neoplasms 


[Flow chart, NEW TERM. Boxes: define term; type 3 x 5 cards; edit;
assign sequential no.; attach flag; keypunch; verify; edit;
interfile IBM cards; interfile 3 x 5 cards.]

Fig. 9. Procedure to enter new term in dictionary
[Flow chart, REVISED TERM. Boxes: revise term; pull 3 x 5 cards;
type revised cards; edit; attach flag; pull IBM cards; keypunch;
verify; edit; type weekly change list; interfile IBM cards;
interfile 3 x 5 cards; revise dictionaries.]

Fig. 10. Procedure to revise term in dictionary


with a number as a whole or part of their name are 
listed at the beginning of the dictionary. These numbers 
are arranged in order of increasing numerical sequence 
irrespective of commas or other punctuation marks. 
When two or more numerical sequences are identical, 
letter designations anywhere in the sequence are used 
as a secondary order of listing.
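
A filing rule of this kind can be expressed as a sort key. The sketch below is one possible reading of the rule: all digits in a name are run together, ignoring commas and other punctuation, and the letters in the name serve as the secondary key. The function name is hypothetical, and any names sorted with it are illustrative rather than actual dictionary entries.

```python
import re


def numbered_name_key(name):
    """Sort key for names that contain a number.

    Primary key: the digits of the name read as one integer,
    ignoring commas and other punctuation.
    Secondary key: the letters of the name, upper-cased.
    """
    digits = "".join(re.findall(r"\d", name))
    if not digits:
        raise ValueError(f"name contains no number: {name!r}")
    letters = "".join(re.findall(r"[A-Za-z]", name)).upper()
    return (int(digits), letters)
```

Passing this key to `sorted` places a name such as "1,210" ahead of "L1210": the numerical sequences are identical, so the letter designation decides.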

Dictionary entries are followed by obligatory and/or 
optional terms describing the biological or pharma- 
cological activities reported in the Abbott Abstracts. The 
descriptive terms for the diseases are followed in every 
applicable case by the disease classification number as- 
signed to each by WHO. Obligatory coding terms follow 
the dictionary entries. These terms must be coded when 
the dictionary term is coded in order to preserve “family” 
relationships. When the term “Chronic Toxicity” (Fig. 
11) is coded the obligatory term is “Toxicity.” Optional 
terms that the indexer may choose are “Pharmacology” 
or “Clinical Pharmacology.” The obligatory use of the
term “Toxicity” permits us to search generically for all
abstracts in which a toxic effect is discussed. The specific
terms so generalized in this case include “Chronic Toxicity,”
“Acute Toxicity,” and “Subacute Toxicity.” Every
optional term is followed by the symbol “II.” If a sequence
of optional terms is surrounded by dollar signs and
any of the individual terms are used by the abstracter-indexer,
all the terms within the dollar signs must be
coded.
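
The coding rules just described (obligatory terms always follow the entry; choosing one term inside a dollar-sign sequence brings in the whole sequence) can be sketched as a small function. The dictionary layout and the field names "obligatory", "optional", and "dollar_groups" are assumptions made for this example, not the format of the punched cards.

```python
def terms_to_code(entry, chosen):
    """Return the terms to write on the coding sheet, without duplicates.

    entry  -- dict with keys "term", "obligatory", "optional",
              "dollar_groups" (hypothetical layout)
    chosen -- optional terms picked by the abstracter-indexer
    """
    groups = entry.get("dollar_groups", [])
    allowed = set(entry.get("optional", []))
    for group in groups:
        allowed.update(group)

    # The entry itself and its obligatory terms are always coded.
    coded = [entry["term"], *entry.get("obligatory", [])]

    # Any term chosen from a dollar-sign group pulls in the whole group.
    wanted = list(chosen)
    for group in groups:
        if any(term in chosen for term in group):
            wanted.extend(group)

    for term in wanted:
        if term not in allowed:
            raise ValueError(f"{term!r} is not listed under {entry['term']!r}")
        if term not in coded:
            coded.append(term)
    return coded
```

For the “Chronic Toxicity” entry, calling the function with no optional choices yields the entry and “Toxicity”; adding “Pharmacology” to the choices appends that term and nothing else.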

We update biological and pharmacological activities as
more information appears in the literature, and thus we
add additional sequences as new concepts are reported.
The article being coded determines which sequence is the
best to use. The possible sequential sets provided in the 
dictionary are designed to describe specific and generic 
concepts. 

An example of the possible sequences from which the
indexer must choose is shown in Fig. 12 under the term


036199243348   *CHRONIC TOXICITY /4/                                          C33428
036199243348B  TOXICITY /4/, PHARMACOL. II /10/, CLINICAL PHARMACOL.          C39429
036199243348C  II /10/                                                        039424


Fig. 11. Dictionary term, obligatory entries, “Chronic Toxicity”
039139298315   *MUSCLE RELAXANT, SMOOTH /3/
039139298315B  $ ANTICHOLINERGIC II, DIRECT-ACTING /3/ II $ $ ANTI-
039139298315C  CHOLINERGIC II, DIRECT-ACTING /5/ II, ATROPINE-LIKE II $
039139298315D  $ INTERNEURONAL BLOCKING AGENT, CNS DEPRESSANT, ANTI-
039139298315E  CHOLINERGIC II, DIRECT-ACTING /3/ II $


Fig. 12. Dictionary entry, Sequence Choice, “Muscle Relaxant, Smooth”


“Muscle Relaxant, Smooth.” Each of the three sequences
is enclosed in dollar signs.

To summarize this brief description of the preparation,
control, and maintenance of Abbott's Biological Dictionary,
it can be stated that our experience has shown
that firm control of any dictionary or thesaurus must be
maintained. When reliance is placed solely on indexing
terms chosen from the article being indexed, it is inevitable
that a loss of pertinent material will occur when a
search is being made. If no thought is given to family
or generic entries each time one of the family is being
indexed, the loss will be even greater. Non-thesaurus,
uncontrolled indexing is very wasteful and unreliable;
therefore, one should give careful consideration to the
format of a thesaurus or dictionary before adapting it
for use.

The appendix shows the special marks used to facilitate
instructions to indexers and to accommodate IBM printouts,
as well as acceptable abbreviations.


Appendix


Special marks are used to facilitate instructions and to
accommodate IBM printout.


Part 1. Punctuation

SYMBOL                   DESCRIPTION

/%/                      Term does not have a random number;
                         encircle in red on the coding sheet.

*                        Term has a random number. (A single
                         asterisk indicates that a term has a
                         random number. Two asterisks mean that
                         the term has a random number and is also
                         generic, e.g., *Mycobacterium, **Bacteria.)

***                      Three asterisks following a term mean
                         that explanatory notes are to be inserted
                         in parentheses after the term.

/1/, /2/, /3/, /4/, /5/, Designation of appropriate areas on
/6/, /7/, /8/, /10/      coding sheet.

/ /                      Parentheses or brackets.

$ $                      Enclosing coding sequences and unique
                         instructions.

Part 2. Abbreviations. The following accepted abbreviations
are used in the dictionary.


ABBREVIATION        DESCRIPTION
A#                  Abbott Abstract Number
ANS                 Autonomic Nervous System
B&BFO               Blood & Blood Forming Organs
CNS                 Central Nervous System
CVS                 Cardiovascular System
CONGENITAL MAL.     Congenital Malformations
DPC E&P             Deliveries, Pregnancy and
                    Childbirth, and Puerperium
DS                  Digestive System
DIS                 Disease or Disorder
ECG                 Electrocardiogram
EEG                 Electroencephalogram
EN                  Endogenous
ES                  Endocrine System
EX                  Exogenous
GS                  Genital System
GUS                 Genito-Urinary System
[illegible]         Indicated
I.A.                Intra-Arterial
I.M.                Intramuscular
INHIB               Inhibitor
I.P.                Intraperitoneal
IP                  Infective & Parasitic
I.V.                Intravenous
LIMIT               Limited to
MAO                 Monoamine oxidase
MENTAL P&P          Mental, Psychoneurotic &
                    Personality
MSS                 Musculo-Skeletal System
MUSCLE, TEND. &     Muscle, Tendon & Fascia
FAS.
NS & SO             Nervous System & Sense Organs
NERVES &            Nerves & Peripheral Ganglia
PERIPHER. GANG.
NOS                 Not Otherwise Specified
NUTR. & MET.        Nutritional & Metabolic
RS                  Respiratory System
S&CT                Skin & Cellular Tissue
SC                  Subcutaneous
SPECIFIC            Specific Term
SYMP.               Symptom
US                  Urinary System
Classification Systems and Their Subjects


A [illegible] Analysis of Different Kinds of Classification Systems
Characterized by Different Types of Subject


The purpose of this paper is to analyze different types
of classification systems characterized by different types
of subjects. The most important types of subjects in
documentation are documents and terms, but other
systems for other subjects are discussed. Systems of
science are dealt with as an introduction to document
systems. Every system ought to have a structure that
will fit the subjects. The hierarchies for documents and
terms are not the same. If we try to classify a category
of subjects in a system made for another category of
subjects (for example, classifying terms or technical
products according to a document system) we will always
meet with difficulties.

We should not be bound for the future to now existing
systems, but for each type of subject we have to
create the best possible system or variant. Our aim
should be to get a system of systems, where every individual
system as far as possible, without distortion of its
primary function, has common features with the other
systems. Principles for designing a modern universal
document system based on concepts and adapted to
technology and a universal term system are advanced.


EJNAR WAHLIN
Stockholm


By way of introduction I may quote a statement from
the FID/CR Conference at Elsinore in the autumn of
1964 (1):


The Conference listed as its aims:

the improvement of existing classifications, including
work on methods for construction of thesauri and
related tools;

the achievement of better design in new classifications;

the exploration and implementation of compatibility
among classification systems and thesauri, including
standardized vocabularies;

the convertibility of the records of material indexed
in one system into another;

and the study of the interaction between classification
systems and computer technology in the process of
system analysis and programming.

These recommendations have the great merit that they
go deep into the core of the problem and may therefore
be considered fruitful. Most of these points are discussed
in this report.

The words “classification,” “classification systems,” and
“systems” occur in all five paragraphs above. It is important
to know exactly what kind of systems are referred
to here.²


¹ Financial support by The Swedish Council for Building Research
and The Swedish Council for Technical Research.

² Concerning the nomenclature in the sequel, I shall consider
“classifications” and “classification systems” as synonymous. “System” is
sometimes used in a wider sense to express contexts which cannot be
presented as written schemes, but the word is not used in the sense
of “information system.”


With regard to the scope of the system we talk of 
universal systems, comprising all fields of knowledge, and 
specialized systems, covering only a limited field. The 
general and traditional structure of classification systems 
is the hierarchical (tree-structure), but among specialized 
systems faceted systems are a particular type. These 
differ from the traditional discipline systems not only in 
structure but also in being entirely based on concepts. 

Examples of adjectives used to denote the meaning of
the word classification in special contexts are mentioned
by R. Mølgaard-Hansen (in the proceedings of the Elsinore
Conference 1964): “informative, topological, dynamic,
synthetic, faceted, hierarchical, factor-analytical, natural,
arbitrary, general, special, and topographical.”

The most important and fruitful approach seems to 
me, however, to characterize the systems according to 
the subject of the system, i.e., what the designer of the 
system wanted to arrange and classify, but very little 
attention seems to have been paid to this problem. We 
find in the literature that some experts, in writing about 
classification systems, have classification of knowledge,
concepts, or ideas in mind, others classification of documents,
while sometimes the subject is terms, things, or
other phenomena. I shall not maintain that some of 
them are right, others wrong. In fact, different kinds of 
classification systems with different types of subjects may 
be necessary in documentation. 

The types of systems I wish to deal with are all “general,”
in the sense that they are not restricted to a special
subject field, and they are all of importance for documentation.
They are all types of systems in existence,
having been devised by different professional groups for
different practical requirements.

I shall first discuss systems of science: several systems 
of this kind have been devised and here we have a back- 
ground to the document systems. I shall not discuss 
systems of knowledge as I do not think that there exists 
any reliable system comprising all knowledge for the 
specific purpose of classifying knowledge, but I shall deal 
with systems for documents based on their knowledge 
content. I will interpret the heading Mathematics in a 
document system as “Documents on Mathematics” and 
the subheadings as “Documents on Algebra,” etc. There 
are other systems (even if not called classification systems)
where the heading Mathematics has another significance,
e.g., “Teaching of Mathematics” (in a timetable)
or “text on mathematics” (in a handbook).


Even if a document system is based on concepts, I will
avoid presenting it as a system for concepts, because the
purpose is not to classify concepts, but to classify
documents.

A universal system for concepts (made with the pur- 
pose to classify concepts) does not exist and is not easily 
made because there doesn’t exist any definite collection 
of concepts from which we can pick up concepts and 
arrange them. 

On the other hand concepts are usually expressed by 
terms and we can talk of systems of terms as we really 
have a definite stock of terms. The terms here are the 
subjects and should determine the structure of the 
system.

Also “practical systems” used in industry and commerce
for registration purposes will be discussed. A
system for products should be designed according to the
structure of the collection of products uninfluenced by 
documentation aspects. 

The essential point in this presentation is to show how 
and why these systems have come about, how they have 
developed, what role they play or should play in docu- 
mentation, and to indicate principles on which we can 
create for every type of subject a new system designed to 
fulfill its function in the best possible way, all systems 
forming a system of systems in which every individual 
system—as far as possible without disturbing its primary 
function—has common features with the other systems. 


1. The System of the Sciences and the Systems
within the Sciences


A. SCIENCE AND SYSTEM


According to Albert Einstein, science is the striving,
with the aid of thought, to arrange the observable phenomena
of our world into, as far as possible, a coherent
system, an attempt to reconstruct our existence with the
help of the formation of concepts. The main purpose of
documentation is to classify scientific literature and make
it accessible. The structure (or system) of science can
then not be without significance.

The order between the individual sciences should not 
be a question of prestige. Only by arranging a science 
in a certain series is it possible to know what is before,
what is within, and what is behind, and thus to get a 
definition of each science. 

The history of the system of the sciences is described
by, among others, Bliss (2) and, in Swedish, by Allen
Vannerus (3). The latter has made very thorough studies
and presents a system that is still of interest. He defines
the “systematics of the sciences” as follows:


What must be done is to bring the sciences, this characteristic
and highly important element in our spiritual
culture, under the form of a system in order to obtain
an insight not only into their multiplicity but also into
their affinities.

In older times the natural starting point for philoso- 
phers was the human mind rather than nature. To quote 
from Francis Bacon (4): 


The parts of human learning have reference to the
parts of man’s understanding, which is the seat of
learning: History to his Memory, Poetry to his Imagination,
and Philosophy to his Reason. . . . Thus I have
made, as it were, a small globe of the intellectual world.

During the 18th century most natural sciences developed
rapidly, but the continuity was still obscure. Fossils
of plants and animals in rock were sometimes interpreted
by scientists as products of “Vis plastica,” i.e., a supposed
power of nature to reproduce living beings in stone. The
history of creation and the chronology of the Bible were
for most scientists a reality. Obviously the time was not
ripe for a coherent system of science.

In the 19th century, in conjunction with the development
of the theory of evolution, a picture became clear
of the relations between the sciences which in broad outline
still holds today and which implied a radical reversal
of the earlier order of sequence between the sciences.
This very important reversal seems not to have been
observed in books on the history of the system of science.

It is moreover paradoxical that during that century,
when it first became possible to speak of a general picture
of the sciences which could be accepted in the whole
western world, the interest in the system of the sciences
began to die out both among philosophers and librarians.
H. E. Bliss is here an exception, as de Grolier (5) points
out:

His principal effort was directed to one point; to draw
bibliographic classification nearer to what he termed
“the scientific and educational consensus,” the scientific
and pedagogic order of subjects of study. His work
from this standpoint has historical importance.

A documentalist now engaged on the relation between
the sciences is the Pole, Z. Dobrowolsky (6), who wants
to create a new “encyclopaedic classification.”

This would be important, not only for the development
of documentation but also for the furtherance of scientific
research by furnishing a basis for rational organization
of research within all fields of knowledge and on
an international level.


It is true that documents nowadays cannot as easily as
before be fitted into traditional folds, since scientific
progress has meant that different sciences are to an ever
increasing extent woven together and make use of one
another. But this does not signify that a system is




B. SYSTEMS WITHIN THE SCIENCES


Apart from the structure of the whole field of science
(the system of science), most sciences have developed
rather detailed and fixed systems or structures of their
own. Linné’s “Systema Naturae” is an example, perhaps
not quite up to date, but in his time bringing order out
of chaos. The systems of the chemical elements, of atoms
and electrons, of the stars and the galaxies, of the
hereditary factors, etc., are other well known systems,
forming a basic layer in the individual sciences and in
the whole field of science as well.


C. MAIN SEQUENCE IN SCIENCE


A division of the entire field of science into only three 
main groups—which can scarcely be questioned—becomes 
clear from the following: 


1. In the beginning there existed only lifeless matter
   and energy. The processes which went on are described
   in the sciences of mechanics, physics, chemistry,
   astronomy, geophysics, etc., with mathematics
   as a basic science. The heavenly bodies and our earth
   appeared.

2. The origin of life signifies the start of a new epoch
   in natural history with a new series of phenomena
   and concepts. Living matter, plants, and animals
   evolved.

3. When some higher mammals developed into man,
   still another phase starts. Language arises, nature
   is exploited, societies and cultural activities come
   into being, and the stock of concepts was rapidly
   increased.


We have here a simple tripartite principle:

Matter (i.e., inorganic matter and energy)
Life (i.e., living matter and physical life)
Culture and society (man as an individual and in
society)


This principle corresponds with that advanced by the
British librarian, James Duff Brown (7), who at the
beginning of this century suggested the simple series:

Matter - Life - Mind - Record

Figure 1 shows two philosophical systems from the first
half of the 19th century and the library systems of Bliss
and Brown constructed a century later which all, by and
large, exhibit the sequence mentioned above. In contrast
thereto, UDC (and Dewey) exhibit quite another structure
representing a picture antiquated already in Dewey’s
time, and this also applies to many library systems used
today.

Nowadays, with our strong trend toward specialization,
these questions are considered a little old-fashioned.
The field of knowledge is regarded by many documentalists
as a coherent mass without distinguishable dividing
lines; or the only way to divide it is according to some
of the well-known traditional library systems. Only very
occasionally is a new voice heard. It was therefore a
strange coincidence that, at the time of writing this
paper, the author happened to read an article in a
Swedish newspaper entitled “Science with or without
method.” Dr. Boris Tullander (8), Lecturer in Economic
Methodology at Uppsala University, writes, though not
with a thought for documentation:


There are at least three areas of our existence which are
sharply distinguished. One may call them (1) the sphere
of the material-mechanical relations, (2) the sphere of
the organic-biological relations, (3) the sphere of the
psychological-sociological relations. These relations,
or “causa nexi,” are geared to one another but they
also represent a distinct advance and one immediately
recognizes the differences.


Coming down to a lower level the principles for further
division are not so self-evident.

A division of the sciences not consistent with that
above is advocated by Maurice Korach (Hungary) in an
interesting analysis (9) of the nature of the sciences:








Pure sciences     Natural sciences      Technical sciences,
                                        technologies

Mathematics       Zoology               Agricultural
Physics           Botany                  technologies
Chemistry         Mineralogy and        Industrial
etc.                petrography           technologies
                  Astronomy
                  etc.





This division seems to aim at the general character of
the sciences and not at their knowledge content (e.g.,
Zoology and Botany between Chemistry and Petrography).
Perhaps we have to admit that the system of
science can be regarded from different aspects. The
documentation aspect, however, seems better justified by
the scheme advocated above.


D. DISCIPLINES OR CONCEPTS 


To set up a fixed and detailed scheme, with the tradi- 
tional disciplines as building blocks, has its difficulties 
even if we are agreed on the principle of their sequence. 
There have been successive changes in the composition 
of the sciences. Philosophy originally comprised nearly 
all science, but later had to relinquish bit by bit. The 
limits between mechanics, physics, and chemistry are 


American Documentation — October 1966 201 


[First system; heading not legible]
I     Mathematics
      Astronomy
      Physics
      Chemistry
      Sociology
III

SPENCER
      Logics
      Mathematics
      Mechanics
      Physics
      Chemistry
      Astronomy

BLISS                                          BROWN
A.    Philosophy
      General Science
      Logic, Mathematics, etc.
B.    Physics                                  Matter
C.    Chemistry
D.    Astronomy
      including Geology,
      Geography
E.    Biology
F.    Botany                                   Life
G.    Zoology
H.    Man
      Physical
      Medicine
I.    Psychology                               Mind
      Social
      Education
K.    Sociology
      Ethnology
      Human Geography
      Travel and Description
L/O.  Social-Political
      History
P.    Religion and Ethics
Q.    Social Welfare,
      Applied Ethics
R.    Political Science
S.    Law
T.    Economics
U.    Useful Arts, Industries,
      Trades
V.    Fine Arts, Philosophy
W/Y.  Literature and Language                  Record 1)
Z.    Bibliography

UDC
0     Generalities
      Bibliography
      Libraries, etc.
1     Philosophy
      Ethics
      Psychology
2     Religion
      Theology
3     Social sciences
      Law
4     Philology
      Linguistics
5     Pure sciences
6     Applied sciences
      (Medicine, Technology)
7     Arts
      Entertainment
      Sport
8     Literature
      Belles lettres
9     Geography
      History
      Biography

1) Record = Literary forms, History, Geography and Biography

Fig. 1. Two philosophical systems from the first half of the
nineteenth century and two library systems constructed a
century later compared with one another and with UDC


debatable. Atomic and nuclear physics have broken away
from classical physics to form a new fundamental physics
related to optics and the theory of electricity. Space
science is another example and similar conditions prevail
over the entire science register. When we come to the
applied sciences, it is difficult to draw a frontier with pure
science and no generally accepted subdivision of technology
exists. Within sociology the difficulty is still
greater, not to speak of psychology.

In the introduction we distinguished between traditional
discipline systems and concept systems. By subdividing
the total field of science according to concepts
one obtains a basis for a system free of the traditional
boundary lines.
This does not imply a revolution in the sequence, as
one can follow in broad outline the sequence presented
in Table 1 and thus make use of the already disentangled
relations. But now we are more free to follow the principle
of introducing new basic concepts in such a way
that each new group can make use of earlier concepts
and no group need make use of concepts from subsequent
groups. One can also, within the pure basic sciences,
obtain a close agreement with the system for units of
measure, magnitudes, and physical “dimensions” as shown
in Table 1.
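
The principle that no group need borrow concepts from a subsequent group can be checked mechanically for any proposed sequence. The following Python sketch does so; the function name is hypothetical, and the “uses” list given for any science is an illustrative assumption of the caller, not a statement taken from the tables.

```python
def forward_references(groups):
    """List every (group, concept) pair where a group uses a concept
    that has not yet been introduced at its place in the sequence.

    groups -- list of (name, introduces, uses) in the proposed order
    """
    known = set()
    violations = []
    for name, introduces, uses in groups:
        known.update(introduces)      # a group may use its own concepts
        for concept in uses:
            if concept not in known:
                violations.append((name, concept))
    return violations


# Illustrative data: the concepts follow Table 2, the "uses" are assumed.
PURE_SCIENCES = [
    ("Mathematics", ["number"], []),
    ("Geometry", ["space"], ["number"]),
    ("Chronology", ["time"], ["number"]),
    ("Kinematics", ["motion"], ["space", "time"]),
    ("Statics", ["force"], ["space"]),
    ("Dynamics", ["mass"], ["motion", "force"]),
]
```

An empty result means the sequence satisfies the principle. Moving Kinematics ahead of Chronology in the list above would make the function report that Kinematics uses “time” before it has been introduced.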

A more extensive sketch of the sciences and their
objects, embracing the introductory groups, is shown in 
Table 2. 


Table 1. The Pure Sciences, Their Magnitudes and Basic
Units of Measure

Science          Magnitudes 1         Dimension      Units of
                                                     measure 2

Arithmetic       Number               0              -
Algebra and      Mathematical
analysis         quantities           0              -
Geometry         Length               l              m
                 Area, etc.           l^2            m^2
Chronology       Time                 t              s
Kinematics       Velocity             v = l/t        m/s
                 Acceleration, etc.   a = l/t^2      m/s^2
Statics          Force 3              F              kp or Newton 4
                 Moment of a force    F x l          kpm
                 Pressure, etc.       F/l^2          kp/m^2
Dynamics         Mass                 m              kg
                 Kinetic energy,
                 etc.                 m x v^2        kg m^2/s^2

1 Basic magnitudes in italics.

2 m = meter, s = second, kg = kilogram, kp = kilopond (1 kilopond
is the gravity of 1 kg under certain conditions).

3 In statics force can be considered as a fundamental magnitude; but
if, as in the mks system, mass is a fundamental magnitude, force
will be derived by Newton's law: force = mass x acceleration
(F = m x a).

4 1 Newton is that force that will give the mass of 1 kg an
acceleration of 1 m/s^2.
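
The way the derived magnitudes of Table 1 build on the basic ones can be followed with a little exponent arithmetic. In the Python sketch below a dimension is written as a triple of exponents (length, time, mass); this representation and the function names are choices made for the example. The conversion from kilopond to newton assumes standard gravity, 9.80665 m/s^2, as the “certain conditions” of footnote 2.

```python
# A dimension is a triple of exponents: (length, time, mass).
LENGTH = (1, 0, 0)
TIME = (0, 1, 0)
MASS = (0, 0, 1)


def mul(a, b):
    """Dimension of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))


def div(a, b):
    """Dimension of a quotient: exponents subtract."""
    return tuple(x - y for x, y in zip(a, b))


def power(a, n):
    """Dimension of a magnitude raised to the n-th power."""
    return tuple(x * n for x in a)


velocity = div(LENGTH, TIME)                  # l/t
acceleration = div(velocity, TIME)            # l/t^2
force = mul(MASS, acceleration)               # F = m x a
moment_of_force = mul(force, LENGTH)          # F x l
pressure = div(force, power(LENGTH, 2))       # F/l^2
kinetic_energy = mul(MASS, power(velocity, 2))  # m x v^2

STANDARD_GRAVITY = 9.80665  # m/s^2, assumed for the kilopond


def kilopond_to_newton(kp):
    """Convert a force in kilopond to newton under standard gravity."""
    return kp * STANDARD_GRAVITY
```

Each derived magnitude uses only magnitudes introduced earlier in the table, which is the ordering principle the text describes.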


| 


Table 2. The Pure (Exact) and the Natural Sciences and
Their Respective Concepts

I. The pure and inorganic material sciences

   A. Pure nonmaterial sciences               Concepts
      (= exact sciences)

      Mathematics                             Number and mathematical
                                              quantities
      Geometry                                Space 1
      Chronology                              Time 1
      Kinematics  )                           Motion
      Statics     ) Mechanics                 Force
      Dynamics    )                           Mass 2

   B. Pure material sciences

      Physics                                 Matter and
      Chemistry                               Energy

   C. Sciences of the universe and the earth
      (= inorganic natural sciences)

      Astronomy                               Universe
      Geophysics                              The earth
      Geology                                 The solid surface of
                                              the earth
      Hydrology. Oceanography                 The water on the
                                              surface of the earth
      Meteorology                             The atmosphere
      Space science                           The space around us

   D. Technology

II. The biological sciences
    (= organic natural sciences)

   A. General biology                         Life
   B. Botany                                  Plants
   C. Zoology                                 Animals
   D. Anthropology. Medicine                  Man

III. ---

1 The concepts space and time are not the same as in Ranganathan's
space and time facets; they do not embrace geography and history.

2 Mass is not to be mixed up with matter or material. Mass is ideal
matter, characterised only by weight and inertia.


The more dynamic the development in our stock of
knowledge, the more important it is, in the construction
of a system of sciences, to build on simple, permanent, 
fundamental concepts. Concepts such as number, force, 
mass, energy, heat, atom, molecule, metals, wood, elec- 
tricity, cell, heredity, sex, organs of sense, etc., stand 
through all times whatever the developments in society 
and technology. Changes in the sequence of the funda- 
mental concepts are—after Newton and Darwin—rare. 
Einstein's theories may have led to a new conception of 
the most elementary basic concepts, but this is hardly of 
practical significance for documentation. 
For dividing the field of technology this principle is of
importance by creating a firmer ground for classifying 
those parts of technology which constitute applied science. 
We shall come back to this question in Section 2. 

To extend the system of sciences to all areas influenced 
by the activity of man is exactly the purpose and the 
aim of science, because science is the knowledge of sys- 
tematical relations between concepts, even if we cannot 
construct a hierarchy comprising all details of science. 


American Documentation — October 1966 203 




* 2. Document Systems 


By document systems is meant systems designed for 
classification of documents, whether for arrangement on 
the shelf or for organization of card indexes or printed 
indexes, each document being classified according to its 
knowledge content. As earlier intimated we can, when 
trying to classify a certain piece of knowledge, use different
principles for different purposes. Therefore we have
a more pragmatic approach if we talk of classifying docu- 
ments according to their knowledge content than if we 
talk of classifying knowledge. 


2.1. HIERARCHICAL DOCUMENT SYSTEMS 


In this category we can distinguish between universal 
and specialized systems. The field of interest is here
divided from above into classes, subclasses, etc., down
to a certain level. For each document we try to find one 
or more places where the document as a whole has its 
domicile. 


2.1.1. UNIVERSAL DOCUMENT SYSTEMS 


This is the category of the well-known library
systems such as Dewey, UDC, etc. The systems used in the
book trade also belong here. 


A. Main Structure 


The main structure of these systems is generally influ- 
enced by the division into disciplines as adopted in the 
universities. In the old days the production of scientific
literature followed this division in broad outline and could
therefore easily be fitted into the pattern.

The similar organization of studies at the occidental
universities meant that the building blocks in the schemes 
of different countries were roughly the same, which facili- 
tated an international diffusion of certain systems. 

The traditional sequence, starting with philosophy, 
religion, and sociology, was accepted by Dewey and was 
characteristic also of the UDC based on Dewey’s scheme. 
For the applied sciences one special main group was 
introduced. 

As time went on, the document systems acquired an 
increasingly detailed structure, especially within the prac- 
tical fields, and the connection with the traditional disci- 
plines was partly dissolved. 

The subdivision of the applied sciences and other prac- 
tical fields was generally based on the differentiation into 
industries and professions and did not stand in any care- 
fully thought-out correspondence with the system of 
sciences, but partially duplicated the latter. This was a 
natural arrangement, for within other fields—mathe- 
matics, biology, law, art, etc.—it was to a large extent 
particular professional categories that corresponded to 
the document groups, both as authors and readers. 


B. Principles of Subdivision


We will first investigate the principal structure of
UDC as a representative for this category of systems.
The main classes have already been dealt with. On lower
levels UDC displays in some parts real generic divisions,
e.g., in language, botany, zoology. In other parts we have
non-generic but strictly hierarchical divisions, implying
that the subclasses are equal and parallel. This is valid,
e.g., for anatomy, for physics (the traditional
subdisciplines), for many parts of technology, etc.
To a great extent, however, the subclasses correspond
to different aspects of the class, the aspects having their
domicile in other parts of the system. An example:


631. Agriculture, farming in general. Agronomy

  1  Farm management
  2  Farms and farmyards
  3  Agricultural apparatus, tools and machinery
     (e.g., ploughs)
  4  Soil science (e.g., physical properties)
  5  Growing, cultivation methods (e.g., ploughing)
  6  Rural engineering (e.g., drainage)
  8  Fertilizers. Manuring (e.g., nitrogen fertilizers)
  9  Other topics


Most subclasses are applications of one special science 
or technique (economy, building, machinery, geology, 
etc.). We have now definitely left the “tree of knowledge”
and the connection with the systematics of sciences.
The system is a tool for collecting—in a collection com- 
prising more than agronomy—all documents written for 
agronomists. The codes could also have been derived by 
combining 631 with other UDC-numbers. 
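
The remark that the codes could have been derived by combination can be sketched as follows. This is an illustration only: the pairing of each agricultural topic with a UDC main-class number (33 economics, 69 building, 621 mechanical engineering, 55 geology) is an assumption made for the example, not an equivalence stated in the UDC schedules or in the text, and the function name is hypothetical.

```python
# Illustrative only: these pairings are assumptions made for the sketch,
# not equivalences taken from the UDC schedules.
ASPECT_NUMBERS = {
    "Farm management": "33",        # economics
    "Farms and farmyards": "69",    # building
    "Tools and machinery": "621",   # mechanical engineering
    "Soil science": "55",           # geology
}


def colon_combination(class_number, aspect_number):
    """Link two UDC numbers with the colon sign."""
    return f"{class_number}:{aspect_number}"


# Derive a combined code for every aspect of class 631.
CODES = {
    topic: colon_combination("631", number)
    for topic, number in ASPECT_NUMBERS.items()
}
```

Under these assumptions the enumerated subclass for soil science and the combined code `631:55` would denote the same kind of document, which is the duplication the text points to.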


C. Precoordination of Concepts 


It may be reasonable to ask how in a document system
we can introduce a limited number of concept combinations
from the very great number of possible combinations,
and if it is possible to design rules regulating which
concept of two should have preference. Is a universal
document system perhaps on the whole an absurdity?

First, we have to consider that a document can often 
(depending on the size and the character of the collec- 
tion) be classified only with the help of a single concept. 

Second, to a certain extent, precoordinated concept 
combinations can be introduced and be a valuable 
feature. 

Not all combinations of concepts have any real impor- 
tance; the number studied and written about in the 
literature is in fact limited in relation to the number of 
possible combinations, even if it is large. A specialized 
documentalist will recognize certain types of combinations
which often recur.

If a document deals with the concepts a and b in relation
to one another, and both of them are represented in
the system, the document can naturally be classified both
under a and b. But if ab is an important combination,
which has caused a great production of literature, there





[...] collection and for a special collection (e.g., a
building trade library), but perhaps not for a specialized
[...]

On a purely generic basis the literature on concrete 
would be classified according to type of concrete. But as 
the literature is dominated by normal concrete made of 
Portland cement and ordinary stone material, it is perhaps
more appropriate to form a system such as the
following: 


Concrete
  Manufacture of concrete
  Properties of concrete
    General
    Mechanical properties
      Strength
      Hardness
    Physical properties
  Special types of concrete





Combinations of concepts such as concrete : strength 
thus have their given place in this arrangement. The 
combination strength : concrete should in such a case not
be used.

In many cases the priority of the concepts a and b
can be questioned but in other cases one sequence is
definitely to be preferred. The building library will surely
prefer to bring together all documents on concrete instead
of bringing together all documents on strength.
“Stronger” concepts like special materials, things, etc.,
will have preference over “weaker” concepts, e.g.,
properties.
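
The preference rule can be sketched as a simple ordering. The ranking table below is hypothetical: the text only says that materials and things are “stronger” than properties, and the numeric ranks, the category labels, and the function name are assumptions made for the illustration.

```python
# Hypothetical ranking: a lower number marks a "stronger" concept category.
STRENGTH_RANK = {"material": 0, "thing": 0, "property": 1}


def precoordinate(concept_a, concept_b):
    """Order two (name, category) concepts so that the stronger one leads.

    Ties are broken alphabetically so the result is always deterministic.
    """
    first, second = sorted(
        [concept_a, concept_b],
        key=lambda concept: (STRENGTH_RANK[concept[1]], concept[0]),
    )
    return f"{first[0]} : {second[0]}"


# The order of the arguments does not matter; concrete always leads.
print(precoordinate(("strength", "property"), ("concrete", "material")))
```

Whichever way the two concepts are supplied, only the sequence concrete : strength is produced, so the reverse combination never appears in the file.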

We thus conclude that concept combinations, if carefully
selected, can be valuable features in a document
system, but it seems very hard to make such a selection
if the structure of the system is not logically built up.


D. Postcoordination of Concepts 


Even if the system includes the most important combinations
today, new important combinations will come
forward and we must have a method for coding them.
How does UDC function in the case of unforeseen
combinations? Let us first see how UDC is presented by
FID. We quote from the General Introduction to the
trilingual edition (my italics) (10):

The Universal Decimal Classification (UDC) is a
scheme for classifying the whole field of knowledge. . . .

. . . a classification in the strictest sense, depending on the
analysis of idea content, so that related concepts and
groups of concepts are brought together. . . .

. . . an integrated pattern of correlated subjects. . . .

. . . constructed on the principle of proceeding from
the general to the more particular by the (arbitrary)
division of the whole of human knowledge. . . .


We also read that: 


it should not be regarded as a philosophical classification
of knowledge.

a practical system for numerically coding information,
so designed that any item, once coded and filed
correctly, can be readily found from whatever angle it is
sought.

. . . the introduction of an auxiliary apparatus of
connection and relation signs, lacking in the original Dewey
system, has made the UDC really universal, in the
sense that it permits almost any desired combination
and modification of basic numbers to denote the most
complex subjects.

It may be true that the system permits coding of
practically any combination of concepts by linking two
concepts with the colon sign or with the use of the auxiliary
tables. But is this coding of such a nature as to provide
sufficient guidance for searching? The concepts to be
combined for forming complex subjects are to a large
extent already combinations of concepts (type ab or abc).
The same concept is for that matter often located in
more than one place in the system.

Can one not often express a certain combination of
concepts in different notational ways? Perhaps ab : lm
has the same signification as al : bm, etc., but how is one
to choose the right entrance, especially if a concept is to
be found in more than one place in the system?
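
The ambiguity can be made concrete with a small sketch. It assumes, purely for illustration, that each elementary concept is a single letter and that a notation joins two precoordinated pairs with a colon; the function names are hypothetical.

```python
from itertools import combinations


def elementary_concepts(notation):
    """Reduce a notation such as 'ab : lm' to its set of single-letter concepts."""
    return frozenset(ch for ch in notation if ch.isalpha())


def two_by_two_notations(concepts):
    """List every way of writing four concepts as two precoordinated pairs."""
    ordered = sorted(concepts)
    notations = []
    for first in combinations(ordered, 2):
        rest = [c for c in ordered if c not in first]
        notations.append("".join(first) + " : " + "".join(rest))
    return notations


NOTATIONS = two_by_two_notations("ablm")
```

Even with the letters kept in alphabetical order inside each pair, the four concepts a, b, l, m can be written in six ways, all of which reduce to the same set of elementary concepts. A searcher who knows only the concepts has no rule for picking the entrance the indexer chose.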

This combining of concepts, which cannot in themselves
be regarded as simple, fundamental concepts selected on
a strictly logical principle, leads to a fairly complicated
procedure. Is this not a compromise between two logical
principles? The principles are as follows:

1. Classification with the aid of a document system
based on both generic and other strictly hierarchical
structures and moreover on certain precoordinated
structures (ab and lm) and

2. Indexing with the aid of a number of equivalent
terms (a, b, l, m).


E. How to Get a Modern Universal Document System?


The first condition for a new universal system is that
it should be linked up with the scientific pattern which
has been generally accepted for more than 100 years
(cf. Bliss) and the second, that it should be designed on
the basis of fundamental concepts instead of traditional
headings.

These questions have been dealt with above with respect
to the system of science. A proposal for the main
structure of a document system based on these principles
has been made by the author (11). Here the possibility
is also advanced of different universal document systems
for different main fields of activity, each system being
universal but designed from a special aspect. (See
Section G.) The proposal mentioned is especially intended
for technology and called TUS, i.e., Technological
Universal (Document) System. The main groups are (cf.
Table 2):


1. Abstract [...] .......................... I A
2. Force. Energy ........................... I B
3. Matter. Materials ....................... I B
4. World [...] ............................. I C
5. Life. Man ............................... II
6. The individual. Humanistic fields ....... III
7. Society ................................. III
8. Material culture ........................ III
9. History. Geography. Biography ........... III


The principle behind the main groups is explained in
Section 1, p. 203. One important factor for a system
intended for technology is naturally the principle for the
division of technology. This question has here been
solved in such a way that all technical subjects, which
can be related to fundamental concepts belonging to those
parts of the system which are dependent on the system
of science, are located in connection with these
fundamental concepts; while those parts of technology which
are best characterized by their purpose, e.g., Transport,
Building, etc., are collected in one main group, called
“Material Culture” or “Functional Material Culture.”
This group will contain subjects of primary interest
to man and society (automobiles, buildings, etc.), while
that which lies behind (the engine, installations, theory)
is considered as secondary and appears in connection
with the fundamental concepts. Strictly speaking, two
separate bases for classification have been used:


      A                            B
(1)   Applied sciences             Nonapplied sciences
(2)   Not of primary interest      Of primary interest to
      to man and society           man and society


These two principles of classification, however, lead to a
great extent to the same result. We find the same
tendency to division of technology into two parts if we study
the titles of the main branches of technological teaching:

A. By science: mechanical, chemical, electrical (etc.)
   engineering

B. By purpose: housebuilding, shipbuilding, aviation,
   mining, etc.


TUS has in the first place to be judged from the
viewpoints of special fields of activity, which only collect
documents for their own needs. Every field has its own main
number in the system, under which the specific documents
will be collected, but has to spread out other documents
over the whole system as far as possible. Different
applications of the system for different types of document
collections are discussed in the sequel.

These principles for designing a universal system were 
advanced as early as 1949 at a conference on Build- 
ing Documentation in Geneva (12). In the last years it 
has been possible to go a little further on this road and 
TUS has now been constructed in detail as regards those 
parts which are of interest to the housebuilding field. It
has been adopted for a bibliography comprising 25 years
of the Swedish journals Arkitektur and Byggmästaren
(13). By way of experiment the whole of UDC in its
trilingual edition has also been rearranged under TUS 
headings, which permits a considerable concentration 
of related fields of knowledge which are widely dispersed 
in UDC. 

The principle of attaching applied science to funda- 
mental concepts was advanced already by J. D. Brown. 
Now, after 60 years, it is fascinating to hear this voice 
from the grave:


Its basis [Brown's system] is a recognition of the fact
that every science and art springs from some definite
source, and need not, therefore, be arbitrarily grouped
in alphabetical, chronological or purely artificial
divisions, because tradition or custom has apparently
sanctioned such usage. The division seen in most
classifications in vogue—Fine Arts, Useful Arts and Science, are
examples of the arbitrary separation of closely related
subjects, which in the past have become conventional,
and it may seem heretical even at this late time to
propose a more intimate union between exact and
applied science.


Brown's system was perhaps imperfect in some ways, but 
it nevertheless seems remarkable that the statement above 
did not influence the international development in classi- 
fication. 


F. Limitations on the Use of a Hierarchical Universal 
Document System 


The scheme need not enter more deeply into details
than a generally applicable differentiation of documents
allows. When we come down to a certain level, perhaps
the 3- or 4-digit level, we can no longer find principles
of classification which can be generally accepted. The
best subdivision for one institute or library may not be
suited to others. A certain institute may very well
arrange a detailed classification for its own use, but in
an international standard one should not go too deeply
into detail. Even if a subdivision on modern principles
into only a few groups could be generally accepted, very
much would be gained. Other methods of search can then
be employed both within a class and in combination with
other classes. Faceted systems, coordinated indexing, or
permutated indexing may have their place here.

The question whether one can classify a document in a 
document system so that it can easily be retrieved is tied 
up also with the size and the degree of specialization of 
the collection in which the document is to be stored. In 
a small collection comprising many fields of knowledge a
single concept may suffice to provide a safe anchorage,
and a universal document system is thus best suited for
such a collection. Perhaps the widest application for
universal document systems is in filing of documents in
offices and for every man’s use.


If the collection grows larger, still with unchanged
composition, more documents will come under every class
number. This can hardly imply that the possibility of
finding relevant documents will decrease. But if we want
to use a bigger collection with full effectiveness, it may
be necessary to try to refine the indexing and search
methods.

If we now compare with a very specialized collection,
it is quite clear that the advantage of the universality
of the system becomes smaller in the same degree as the
documents are assembled under fewer headings. To
characterize every document, we are forced either to get the
class numbers in question more detailed in a way that
corresponds to the collection or to use other methods
of indexing.

We have here, of course, presumed that the aim of the
system is to display the collection in the widest possible
way, and not to concentrate documents according to field
of activity, as the character of some groups of UDC
seems to indicate.


G. Different Applications of a Document System for
Different Fields of Activity


By fields of activity is meant fields with a common
need for documentation, their own special journals, etc.
These fields may be scientific or social, sectors of industry
or professions. Every part of the universal scheme
corresponds to one or more fields of activity.

We also have a hierarchy, apart from the hierarchy
in the document system, that could be exemplified by:


Science                          = “universal field”
Technology                       = “suprafield”
Building and civil engineering   = “field”
Architecture                     = “subfield”


With increasing specialization the special document col- 
lections are more important than the universal, and the 
first aim of a universal system is to be a tool for special 
collections. 

If we try to apply the principles mentioned above and
draw up system structures suitable for (1) the universal
field, for (2) some suprafields, for (3) some fields, and for
(4) some subfields of activity, it is not probable that we
shall get the same result in all these cases. This is one
reason why the system presented above (TUS) is adapted
to a “suprafield,” technology.

I do not think that it is suitable to aim at different
document systems for the fields and subfields under
technology, but it may be justified to have different
applications of the same system. If we have a document
“Electrical Installations in Buildings,” the classification allotted
when publishing the document will consist of two equivalent
notations: b for building and e for electrical. The
“electrical library” may use b and the building library e
as primary classification. While this is a simple example,
the problem as a whole is rather complicated and cannot
be thoroughly studied without the background both of a
certain document system and of the outline of a division
into fields of activities and their documentation activities.
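
The example of the two libraries can be sketched as follows. The sketch rests on one reading of the passage, stated here as an assumption: a special library files a document under the notation that is not its own field, since its own notation would apply to nearly everything it holds. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    notations: frozenset  # equivalent notations allotted at publication


def primary_notation(document, own_field):
    """Pick the filing notation for a library whose own field is `own_field`.

    Assumption: the library prefers a notation other than its own field,
    falling back to its own notation only when nothing else is available.
    """
    others = sorted(document.notations - {own_field})
    return others[0] if others else own_field


DOC = Document("Electrical Installations in Buildings", frozenset({"b", "e"}))
```

With this rule the electrical library files the document under b and the building library under e, so the same pair of published notations serves both collections without any reclassification.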

It is conceivable however that documents, as an aid 
both for filing in different collections and for distribution 
to the proper categories, might be classified both accord- 
ing to knowledge content and to receiver categories (fields 
of activity). 


2.1.2. SPECIALIZED HIERARCHICAL SYSTEMS 


Among the reasons why there are hundreds or thousands
of special systems in use in most countries, one is that 
the UDC numbers are too long and that specialized 
spheres of interest disappear in the universal collection. 
This is not a weighty reason, for one often overlooks the 
simple course of substituting the most heavily loaded 
numbers by letters and making a selection from the main 
system in which irrelevant titles are weeded out. A 
more weighty reason is that the structure of UDC has 
proved unsuitable for many particular fields. The absence 
of a main group for materials, for instance, seems to be
a serious defect. Only a radical revision would suffice in
face of such criticism.

If one could get all those who now use special systems
to use a common universal system instead, the circle of
users of the universal system would be multiplied many
times over. The picture is, however, not complete without
paying consideration to the faceted systems.


2.2. FACETED DOCUMENT SYSTEMS 


The faceted systems drawn up in England are all in- 
tended for special fields. They are not, as are hierarchical 
specialized systems, constructed by dividing the special 
field of knowledge from above in certain parts. Instead 
they are constructed from below by organizing the most 
important concepts (represented by certain terms) in a 
limited number of series (facets) and the documents are 
characterized by means of one concept from some (or all) 
of the facets. Thus there is no precoordination of con- 
cepts involved. If we consider each facet as one classifica- 
tion system, the concepts represent the “subjects,” but for 
the system taken as a whole the documents represent the 
subjects. The faceted principle is more related to indexing
than to classification in the general sense.
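
A faceted description can be sketched in a few lines. The facet names are borrowed from the main facets Things, Activities, and Properties; the terms placed in each facet and the function name are hypothetical and serve only as an illustration.

```python
# Facet names follow the main facets named in the text; the terms inside
# each facet are hypothetical examples.
FACETS = {
    "Things": {"concrete", "brick"},
    "Activities": {"manufacture", "testing"},
    "Properties": {"strength", "hardness"},
}


def characterize(**chosen):
    """Describe a document by at most one concept from each facet.

    The result lists the chosen concepts in the fixed order of the facets,
    regardless of the order in which they were supplied.
    """
    for facet, concept in chosen.items():
        if facet not in FACETS:
            raise KeyError(f"unknown facet: {facet}")
        if concept not in FACETS[facet]:
            raise ValueError(f"{concept!r} is not a term of facet {facet}")
    return tuple((facet, chosen[facet]) for facet in FACETS if facet in chosen)


DESCRIPTION = characterize(Properties="strength", Things="concrete")
```

No combination such as concrete : strength has to exist in the schedule beforehand; it arises only when a document is described, which is why no precoordination is involved.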

What is here of special interest is that members of
CRG are striving to design a universal document
system with faceted structure. Main facets are Things,
Activities, and Properties. A series for things arranged
according to “integrative level” has been published. For
further information it is better to refer to various
publications of the members of CRG, especially perhaps to
Foskett’s Classification and Indexing in the Social
Sciences (14), where, moreover, a very interesting and
exhaustive analysis of the whole classification problem is
given. We quote from this book:


What is certain is that a general classification scheme 
is needed, that none of the existing schemes are
satisfactory, and that we have enough ideas about the
structure of a new scheme to make the effort to pro- 
duce one worthwhile. 


If we compare this project with UDC or with the enter- 
prise to design a new hierarchical document system along 
the lines advocated above or other projects going on, 
it would be better not to try to formulate an opinion 
concerning which may be the best way. Research and 
experiments in more directions and with organized com- 
parisons may be better than any attempt to compel the 
development into a single track. 


* 3. Other Systems Based on Knowledge Content


A. EDITORIAL SYSTEMS


Every author of a textbook, and every scientific writer, 
is faced with the same problems as the documentalist, 
the systemization of knowledge in some way or another; 
but in the editorial field these problems are not at all 
objects for organized studies. The editor of a techno- 
logical compilation, for example, or of a code of stan- 
dards, is presented with the problem of arranging a num- 
ber of “facets.” These must be chosen so as to bring out 
the most important concepts as headings; the number of 
facets must not be so small that they cannot accommodate
the desired contents. Nor must it be too large, for
then there will be too many entries. The editor will then
have great problems in deciding on the choice of location
of the knowledge elements and the reader will meet with
difficulties in finding the information he requires.³

Just as important as it is to keep documents in order, 
so it is to bring order into their contents by editing the 
documents. 

Here, as within documentation, we have the struggle
between the alphabetical and the systematic principle.
The alphabetical has a great significance in encyclopaedic
works, especially the bigger ones of a general nature, and
in books of reference, owing to the fact that the general
public do not want to choose between different possible
entries but go direct to a title-word they have in mind.
Just as a document may occur several times in an alpha- 
betical index, so information can be duplicated in an 
alphabetical work, perhaps linked together by cross ref- 


³ An application of a specialized system with decimal notation has
been realized on a large scale in the Swedish building handbook BYGG,
published in the 1940's in four volumes (in the 1960's increased to 6
volumes), with three numerals before the colon and a maximum of
three after (e.g., 242:518 Concrete; strength; compression; testing).
This system has also been used for filing of documents and is often
preferred to existing library systems. The same notation principles
but with a universal system are used in the general encyclopaedia
Fakta (Table 3).




erences. For books intended for learning and education 
the systematic form is the obvious choice. 

Editorial systems have significance on the universal 
plane as well. Compilations of this kind exist both 
with the classical disciplines (mathematics, mechanics, 
physics . . . ) as headings, and with concept headings 
(number, space, time, mass . . .). Table 3 shows the 
systems in some modern encyclopaedias from different 
countries which conform closely to Table 2. 

Books of this kind should follow the systematics of 
science, i.e., present knowledge in a natural sequence,
implying that elementary phenomena are described be- 
fore the more complicated and that new fundamental 
concepts are not introduced until the presentation so 
demands. Cross references should go backwards (toward 
the beginning of the book) and the need for them should 
be minimal. 
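
The requirement that cross references point backwards can be tested against any proposed sequence. The sketch below is an illustration under stated assumptions: the sequence and the dependencies are taken from the magnitudes of Table 1 (velocity from length and time, kinetic energy from mass and velocity), while the function name and the data layout are hypothetical.

```python
def forward_references(sequence, uses):
    """Return (concept, needed) pairs where `needed` is introduced too late.

    `sequence` is the order of presentation; `uses` maps a concept to the
    concepts its description relies on. An empty result means every cross
    reference points backwards, toward the beginning of the book.
    """
    position = {concept: index for index, concept in enumerate(sequence)}
    return [
        (concept, needed)
        for concept in sequence
        for needed in uses.get(concept, ())
        if position[needed] > position[concept]
    ]


SEQUENCE = ["number", "length", "time", "velocity", "force", "mass",
            "kinetic energy"]
USES = {
    "velocity": ["length", "time"],
    "kinetic energy": ["mass", "velocity"],
}
```

For the sequence above the check returns an empty list. Moving velocity ahead of time would be reported at once as a forward reference.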


B. EDUCATIONAL SYSTEMS 


The most important phase in the distribution of knowl- 
edge is the education organized by society. Systems are 
necessary for dividing up the sphere of knowledge taught 
at each school and university into curricular subjects.
Obviously these systems must be based on the system of 
science. It should be possible to give every subject a 
definition by reference to a universal system. 

All educational planning should be based on a sys- 
tematic analysis of the specified requirements of knowl- 
edge for different occupational groups. The relation 
between education and documentation is mutual and sig- 


TABLE 3. Some Modern Encyclopaedic Works with Systematic
Structure (by Subject)

Larousse Metodique (Librarie Larousse, Paris 1959)

  Aritmetique            Astronomie         Biologie
  Algebre                Physique           Botanique
  Geometrie              Chimie             Zoologie
  Mecanique rationelle   Geologie           etc.

Universitas Litterarum (Walter de Gruyter, Berlin)

  Matematik              Mineralogie        Medizin
  Physik                 Palaentologi       Psykologie
  Chemie                 Botanik            Volkerkunde
  Astronomie             Zoologie           Sociologi
  Geologie               Anthropologie      etc.

Kleine Enzyklopadie Natur (Verlag Enzyklopadie, Leipsig 1960)

  Zahl                   Kraft (energi)     Leben
  Raum                   Stoff              Tier
  Zeit                   Weltall            Pflanze
  Mass u. Gewicht        Erde               Mensch

Fakta (Bokforlaget Fakta AB, Stockholm, 1955-1961) and
Facta (Ediciones Rialp, Madrid, 1962-1966)

  Mathematics            Geology            Countries and peoples
  Mechanics              Meteorology        Humanistic fields
  Physics                Biology, general   Society
  Chemistry              Plants, Animals    Technology
  Astronomy              Man                Home and environment





nificant, but have the relations between curricular sub- 
jects and document systems anywhere been investigated? 


C. SYSTEMS FOR PATENTS AND LEGAL TEXTS 


Both in a patent specification and in a code of laws,
certain relations are established between concepts which
have a legal force. The problem of classifying or indexing
such texts is, as is well known, closely akin to the general 
problem of documentation and these questions have been 
dealt with in the documentation literature. In fact such 
texts should be ideal experimental subjects owing to their 
concentrated form and the care with which concepts are 
selected, clothed in word form and syntactically compiled. 


* 4. Term Systems 


A. GENERAL 


Vocabularies in different languages have not been
created at one time as the result of a general study, but
have developed successively among different peoples
insofar as they have felt a need for words. Science,
technology, and social organizations are constantly producing
many new concepts and require an extended and exactly 
defined vocabulary. The creation of new words is a slow 
process, and it is therefore not remarkable that the con- 
nection between concepts and words is poor. The nomen- 
clature organizations have the difficult mission of guiding 
the development into the proper paths. 

The stock of terms is of great interest in documentation.
The knowledge contained in documents can, as is
well known, be indexed, either by a number (or several
numbers) in a document system (“item-entry”) or by a
number of concepts or terms characteristic of the
document. If we index by terms (“term-entry”) it is important
to draw up a list of terms (descriptors or key words)
which should be given priority both by the indexer and
by the searcher. These terms are collected in a special
word list, often called a “thesaurus.”
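
Indexing by terms can be sketched as an inverted file in which each term of the word list points to the documents it characterizes. The document numbers, the terms, and the function names below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_term_index(documents):
    """Invert a mapping document -> terms into a mapping term -> documents."""
    index = defaultdict(set)
    for document_id, terms in documents.items():
        for term in terms:
            index[term].add(document_id)
    return dict(index)


def search(index, *terms):
    """Return the documents indexed under every one of the given terms."""
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical collection: document number -> terms assigned by the indexer.
DOCUMENTS = {
    1: {"concrete", "strength"},
    2: {"concrete", "manufacture"},
    3: {"steel", "strength"},
}
TERM_INDEX = build_term_index(DOCUMENTS)
```

The search for concrete and strength together succeeds only if indexer and searcher use the same words, which is the reason for agreeing on a list of preferred terms.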





B. DIFFERENT WAYS OF ARRANGING TERMS 


The commonest and simplest method of arranging
terms is alphabetically, a convenient but, as we know,
an often deceptive method. As a complement to all other
classifications, however, the alphabetical is indispensable.

Then we have the grammatical division according to 
parts of speech which determine to a certain extent what 
syntactical role the word can play. The trilogy things,
activities (processes), and properties corresponds by and
large to the three main parts of speech, the substantives,
the verbs, and the adjectives, but the connection between
concepts and parts of speech is, as is well known, not always
unambiguous. Prepositions and other small words explain,
together with a more or less extensive system of case
declensions, the syntactical connection between the words.





The logic of natural speech, however, seems to be insufficient
in qualified information retrieval, and different sets
of “role indicators” have been devised by different documentation
schools or experts.

There is also the etymological classification, which is 
based on the origin of words and cuts across the gram- 
matical classification. Words are organisms which are 
born, develop, change their meaning and spelling, and 
sometimes die. The etymological classification corre- 
sponds to the natural genus-species division in the plant 
and animal kingdoms. The synonyms are representatives 
of different etymological individuals, joined in the same 
concept category. The words “hound” and “dog,” for 
example, come in different etymological places but in the 
same concept category and so do “hen” and “poultry.” 
Composite words correspond to biological crossings. 

We can also classify words semantically, on the basis 
of the concepts the words are to convey. 

Concepts can to some extent be expressed by symbols 
or pictures without the help of words. Geometrical figures 
can be easily illustrated; the elements, too, by means of 
their electron shells; chemical compounds by their mo- 
lecular diagrams; cells, tissues, plants, and animals can 
be illustrated, as can technically produced objects. (Maps, 
diagrams, and tables can also exhibit very complicated 
relations without words.) Illustrated glossaries are not 
unknown and have a function to fulfill within documenta- 
tion. 

The language of symbols is an intermediate step be- 
tween concepts and words. Can one imagine a complete 
universal system of systematically arranged symbols? 

Terms can also be expressed by definitions, a new term 
being defined by means of other already defined terms. 


This implies a certain systematic arrangement, e.g.:

3. A figure bounded by three straight lines: triangle
4. A figure bounded by four straight lines: quadrangle
4.1 Ditto with right angles: rectangle
4.11 Ditto with right angles and equal sides: square

Here, as a matter of fact, we are approaching the editorial
principle of presenting knowledge in textbooks by
building a system of concepts step by step (cf. Section 3).
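
The numbered definitions above can be read as a small data structure in which each notation points to its next broader term. The following Python sketch is only an illustration of that reading; the rule that one digit is added per level after the decimal point is an assumption inferred from the four entries, and the names Definition, parent_notation, and broader_terms are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Definition:
    notation: str  # place in the systematic arrangement, e.g. "4.11"
    term: str
    text: str      # wording of the definition as given in the list


# The four entries of the example list.
DEFINITIONS = [
    Definition("3", "triangle", "A figure bounded by three straight lines"),
    Definition("4", "quadrangle", "A figure bounded by four straight lines"),
    Definition("4.1", "rectangle", "Ditto with right angles"),
    Definition("4.11", "square", "Ditto with right angles and equal sides"),
]

BY_NOTATION = {d.notation: d for d in DEFINITIONS}
BY_TERM = {d.term: d for d in DEFINITIONS}


def parent_notation(notation: str) -> Optional[str]:
    """Notation of the next broader term, or None at the top level.

    Assumption: one digit is added per level after the decimal point.
    """
    if "." not in notation:
        return None
    head, tail = notation.split(".", 1)
    return head if len(tail) == 1 else f"{head}.{tail[:-1]}"


def broader_terms(term: str) -> list[str]:
    """Terms on which the definition of `term` rests, nearest first."""
    chain = []
    parent = parent_notation(BY_TERM[term].notation)
    while parent is not None:
        chain.append(BY_NOTATION[parent].term)
        parent = parent_notation(parent)
    return chain


print(broader_terms("square"))
```

Each new term is thus defined only by terms that already have a place in the arrangement, which is the step-by-step principle referred to in the text.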

To a certain extent terms can be easily arranged in 
groups which are, more or less, generally accepted, such 
as the elements, chemical compounds, rock species, 
plants, animals, etc. If we strive for a fixed, unambiguous
location for every word, we must always keep to the
fundamental meanings of the words. Granite, for instance,
is a geological term (a species of rock) and not
a building material or a product of the quarrying
industry, for the definition of the word belongs to the
sphere of geology. Rose is a botanical term; dog, a
zoological one.

Lists of terms arranged with regard to the semantic
signification of the words are needed within documenta- 
tion. A scheme for a complete, systematic list of terms, 
in which each term is as far as possible located according 
to its meaning, we shall here call an unambiguous se- 
mantic term-concept system or, more simply, a term 
system. 

The elementary terms in the sciences are generally not
created for the sciences but for more primitive needs of
man. Hot and cold are originally not physical terms;
they are expressions for certain feelings of man. However,
the sciences have need of such elementary terms
and allot them fixed positions and clear definitions. We
get good guidance for an unambiguous location
if we give preference to the more fundamental locations
in the series of the sciences and concepts as shown in
Table 2.

To a very large extent even terms considered to be
“general” can thus be assigned a place according to the
sciences. Large, small, increase, decrease will go to
Mathematics; long, short, round to Geometry; rapid,
slow to Kinematics; force, equilibrium to Statics; hot,
cold to Thermodynamics; beam, reflex to Optics; granite
and syenite to Geology, etc.
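
The rule of giving preference to the more fundamental location can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's procedure: the order of SCIENCE_SERIES simply follows the order in which the sciences are named in the paragraph above (Table 2 itself is not reproduced here), "Building technology" is a hypothetical later entry added for contrast, and the USED_IN data are invented.

```python
# Series of the sciences, most fundamental first. The order follows the
# enumeration in the text; "Building technology" is a hypothetical addition.
SCIENCE_SERIES = [
    "Mathematics",
    "Geometry",
    "Kinematics",
    "Statics",
    "Thermodynamics",
    "Optics",
    "Geology",
    "Building technology",
]
RANK = {science: position for position, science in enumerate(SCIENCE_SERIES)}

# Fields in which each term is actually used (illustrative data only).
USED_IN = {
    "long": {"Geometry", "Kinematics"},
    "force": {"Statics", "Building technology"},
    "hot": {"Thermodynamics", "Geology"},
    "granite": {"Geology", "Building technology"},
}


def locate(term: str) -> str:
    """Return the single, most fundamental location of a term."""
    return min(USED_IN[term], key=RANK.__getitem__)


for term in sorted(USED_IN):
    print(term, "->", locate(term))
```

However many fields use a term, the term receives exactly one location, which is the property required of an unambiguous term system.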


C. SURVEY OF EXISTING TERM LISTS 


One might have expected there to exist a systematically
arranged glossary of all our technical terms, at all
events in some language; in fact there appears to be
none, at least none that is up to date.

A well-known systematic glossary of universal character
is Roget’s Thesaurus, published in England in
1852, which was enlarged in 1952 (15) and is widely
used as a dictionary of synonyms by writers and editors.
The original system has been retained, but it is unacceptable
for modern science and technology, as physical
and technical concepts are derived from the human senses,
which leads to corresponding gaps in physics and technology.
Example:


Matter                    Intellect
  Organic matter            Communication of ideas
    Sensation                 Modes of communication
      Touch, Heat,              Publication
      Taste ...                   ... journalism,
                                  ... radio,
                                  ... television


Coordinate indexing has brought a new wave of inter- 
est in technical terms, and attempts are being made 
within all branches of technology to perfect their stocks 
of words. There have been discussions concerning the 
use of UDC as a systematic term list; but UDC is not 
constructed as a term system, and the same terms may 
occur 5 to 10 times or more, whereas a large number of 
the elementary words are missing. One would first have 
to rebuild UDC so that the same term occurs only once, 
and the principles of such adjustment of the system 
would then have to be investigated, which to some extent
is equivalent to constructing a new system. 

It is imaginable, however, that large parts of UDC, 
e.g., within chemistry and electrotechnics, could be used.
And as UDC is published in several languages, using
this system as far as possible, and possibly other classification
systems as well, should be valuable.


Example from the trilingual edition of UDC:

621.375

German                    English                            French
Röhrenverstärker          Valve amplifiers                   Amplificateurs à lampes
Magnetische Verstärker    Magnetic amplifiers (transductor)  Amplificateurs magnétiques
Kristallverstärker,       Crystal amplifiers (transistor)    Amplificateurs à cristal,
  Transistoren                                                 Transistors
Dielektrische Verstärker  Dielectric amplifiers              Amplificateurs diélectriques


A glossary of terms common to the whole of technology
and intended for coordinate indexing is the Engineers
Joint Council (EJC) Thesaurus (16). The terms are
alphabetically arranged, but with references to “broader 
terms,” “narrower terms,” and to otherwise related words 
(“refer to”), and from words which should not be used 
as descriptors to the respective descriptors. These ar- 
rangements may be interpreted as a system of “micro- 
hierarchies,” but no attempt has been made to bring 
together all terms into a single systematic totality. 
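
The reference structure just described (broader terms, narrower terms, related terms, and references from non-descriptors to descriptors) can be pictured as follows. The Python sketch is a hypothetical illustration and does not reproduce the EJC Thesaurus: only the pair Radiation / Cosmic radiation is taken from the comparison made later in this section, while the "Cosmic rays" reference, the related-term link, and all class and function names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    term: str
    broader: set = field(default_factory=set)   # "broader terms"
    narrower: set = field(default_factory=set)  # "narrower terms"
    related: set = field(default_factory=set)   # "refer to"
    use: Optional[str] = None                   # set for non-descriptors


# A tiny alphabetical list; entries other than the Radiation / Cosmic
# radiation pair are invented for illustration.
THESAURUS = {
    e.term: e
    for e in [
        Entry("Cosmic radiation", broader={"Radiation"},
              related={"Elementary particles"}),
        Entry("Cosmic rays", use="Cosmic radiation"),
        Entry("Elementary particles", related={"Cosmic radiation"}),
        Entry("Radiation", narrower={"Cosmic radiation"}),
    ]
}


def descriptor(word: str) -> str:
    """Follow a 'use' reference from a non-descriptor to its descriptor."""
    entry = THESAURUS[word]
    return entry.use if entry.use is not None else entry.term


def missing_reciprocals() -> list:
    """(broader, narrower) pairs recorded in one direction only."""
    problems = set()
    for entry in THESAURUS.values():
        for narrow in entry.narrower:
            if entry.term not in THESAURUS[narrow].broader:
                problems.add((entry.term, narrow))
        for broad in entry.broader:
            if entry.term not in THESAURUS[broad].narrower:
                problems.add((broad, entry.term))
    return sorted(problems)


print(descriptor("Cosmic rays"))
print(missing_reciprocals())
```

Each entry carries only its immediate neighbours, which is why such a list yields "micro-hierarchies" rather than one systematic whole.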

A number of special thesauri have been compiled for 
different fields, and in them one sometimes finds a sys- 
tematic classification of the terms of that field. In a 
thesaurus published by Euratom (17) for nuclear tech- 
nology, for instance, we find a classification of the vocabu- 
lary into 42 groups (key word groups). It may be of 
interest to see whether two institutions select and arrange 
the terms within the same field in the same way. Within 
the field of radiation and electromagnetic waves Euratom 
arranges the key words in four groups: 


72. Optics
73. Magnetism
84. Radiations
85. Elementary particles

If we compare the 17 key words under Radiation in
Euratom with the 28 “narrower terms” under Radiation
in EJC, we find that only one is exactly the same in the
two groups, namely Cosmic Radiation, while the following
three are similar in their import but differ in spelling
and linguistic form:


Cerenkov radiation — Cherenkov radiation 

Gamma rays — Gamma radiation 

X-rays — X-radiation 

Otherwise the selection of terms is entirely different in
the two thesauri. The compatibility between them is
obviously very poor. As regards special thesauri the
following points may also be noted: The barrier between
science and technology that is characteristic of UDC does
not exist here. The key word groups thereby have a
homogeneous character, and the structure is more [...]

Concerning terminology, on the other hand, one notes
a diverging tendency, since each special branch of technology
uses general terms in a sense specialized for the
branch. In a thesaurus for roadbuilding, words such as
base, bed, carpet, coat, deep, junction, etc., appear in
the key word group “Road.” This may be acceptable if
road documentation is to be considered an isolated activity,
but if this activity is to be included in a broader
context, one will have to deal with different vocabularies.
Closest agreement between terms and headings is found
in key word groups such as “Mathematics—Mechanics”
and “Optics.”
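
The poor compatibility between the Euratom and EJC lists under Radiation can be put into a single figure. The sketch below uses only the counts stated in the text (17 key words, 28 narrower terms, one exact match, three near matches); the choice of the Jaccard coefficient as the measure is an assumption made for illustration and is not a method used in the paper.

```python
def jaccard(shared: int, size_a: int, size_b: int) -> float:
    """Shared terms divided by the number of distinct terms in both lists."""
    return shared / (size_a + size_b - shared)


EURATOM_TERMS = 17  # key words under Radiation in Euratom
EJC_TERMS = 28      # "narrower terms" under Radiation in EJC
EXACT_MATCHES = 1   # Cosmic Radiation
NEAR_MATCHES = 3    # same import, different spelling or linguistic form

strict = jaccard(EXACT_MATCHES, EURATOM_TERMS, EJC_TERMS)
lenient = jaccard(EXACT_MATCHES + NEAR_MATCHES, EURATOM_TERMS, EJC_TERMS)

print(f"exact matches only:      {strict:.3f}")
print(f"including near matches:  {lenient:.3f}")
```

Even when the three near matches are counted as agreements, fewer than one in ten of the distinct terms is common to both lists.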

An interesting example of a list of terms is Medical
Subject Headings (18) of Index Medicus, a list of biomedical
terms arranged both alphabetically and systematically
(“categorized list”). We quote:


The Categorized List is an attempt to bring related
terms together where the indexer and searcher can
view a panorama of subject headings and select the
most appropriate heading for his needs.


The introductory categories are:

A. Anatomical terms
B. Organisms
C. Diseases
D. Chemicals and Drugs
E. Analytical, Diagnostic, and Therapeutic Technics
   and Equipment

Examples of subcategories:

A1. Parts of the Body
A2. Musculoskeletal system
A3. [...] system


The terms (“subject headings”) have usually been unambiguously
locatable within a main category, but thereunder
they often occur within several subcategories.
Example: “Hand” occurs under both A1 and A2. As
cross references are given, this is perhaps of no particular
inconvenience, but by adjustment of the category headings
it should be possible to achieve unambiguous locations
to a greater extent. Generally speaking, one may
say that A, B, and C of Index Medicus could be incorporated
in a universal system of terms.
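
Duplicate locations of the kind noted for “Hand” are easy to detect mechanically once the categorized list is available as term-subcategory pairs. The following Python sketch is a hypothetical illustration: only the “Hand” assignments come from the text, the other two pairs are invented, and the function names are assumptions.

```python
from collections import defaultdict

# (term, subcategory) pairs. "Hand" under A1 and A2 is from the text;
# the remaining assignments are invented for illustration.
ASSIGNMENTS = [
    ("Hand", "A1"),
    ("Hand", "A2"),
    ("Abdomen", "A1"),
    ("Femur", "A2"),
]


def locations(assignments):
    """Map every term to the sorted list of subcategories it occurs in."""
    found = defaultdict(set)
    for term, subcategory in assignments:
        found[term].add(subcategory)
    return {term: sorted(subs) for term, subs in found.items()}


def duplicate_locations(assignments):
    """Terms that occur in more than one subcategory."""
    return {t: subs for t, subs in locations(assignments).items() if len(subs) > 1}


def main_categories(subcategories):
    """Main categories (first letter of the notation) a term falls in."""
    return {sub[0] for sub in subcategories}


duplicates = duplicate_locations(ASSIGNMENTS)
print(duplicates)
print({term: main_categories(subs) for term, subs in duplicates.items()})
```

In this toy data “Hand” has two subcategory locations but only one main category, which is the situation the paragraph describes.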

It is interesting to compare these categories with the 
headings which are usual in document systems: 


Index Medicus (term categories)    UDC (subject fields)
A. Anatomical terms                58. Botany
B. Organisms                       59. Zoology
C. Diseases                        61. Medical Sciences

The pattern in the “realm of knowledge” is not the
same as in the “realm of terms.” Botany, Zoology, and
Medicine have largely the same stock of terms as the
groups A, B, and C. But if terms are collected and
arranged from the bottom upwards as in a thesaurus, 
one comes automatically upon certain groups which do 
not always agree with the traditional sciences and are 
found in document systems for which the classification 
is made from the top downwards. 

One can easily find out which classification of a col- 
lected stock of biomedical terms results in the largest 
number of duplicate locations. That, at all events, 
Zoology and Medicine have a common stock of ana- 
tomical terms is natural since, anatomically, man is an 
animal. Thus the A, B, and C grouping above must in 
this case be better for the term system. But the following 
term categories are also conceivable: 


Botany Zoology and Medicine 
Anatomical terms Anatomical terms 
Organisms Organisms 
Diseases Diseases 


In a universal term system, of course, the group Chemicals 
must not come under the heading Biomedicine, but ap- 
pear earlier in the systematic sequence. One may then 
perhaps find that Drugs as well can be removed from 
the medical field. 

In the field of materials technology I have made some 
attempts to systematize the stock of terms (materials, 
processes, properties), and the result suggests that an 
unambiguous systematic grouping of these terms is not
a hopeless task. It is also quite clear that one arrives at
hierarchies of an entirely different kind than in a document
system, as will be shown.

In the section on document systems (2.1.1.C) we have
given an example of a possible subdivision of a document
collection on concrete, corresponding to the composition
of the stock of documents. In a term system only words
relating exclusively to the concept “concrete” may appear
under the heading (words such as standard concrete,
lightweight concrete, etc.), while terms for properties and
processes appear elsewhere. In the document system the
hierarchies are to a certain extent expressions of combinations
of concepts which have been regarded as suited
to a collection of documents, and a document on “Strength
of Concrete” has a fixed location there. In the term system
one is concerned only with generic hierarchies (each
term being defined by superior headings). The same
document has to be indexed with the help of two terms (as
in a faceted system). (Fig. 2.)
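
The difference can be made concrete with a small sketch in Python. It is an illustration under assumptions: the hierarchy loosely follows the headings named in the text and in Fig. 2, the placement of "Strength" under "Mechanical properties" is assumed, and BROADER, INDEX, chain, and documents_under are hypothetical names.

```python
# Generic hierarchies of a term system: each term has one broader term.
BROADER = {
    "Inorganic materials": "Materials",
    "Concrete": "Inorganic materials",
    "Standard concrete": "Concrete",
    "Lightweight concrete": "Concrete",
    "Properties of materials": "Properties",
    "Mechanical properties": "Properties of materials",
    "Strength": "Mechanical properties",  # assumed placement
}

# A document is indexed by several terms instead of one fixed location.
INDEX = {"Strength of Concrete": {"Concrete", "Strength"}}


def chain(term):
    """The term followed by its superior headings up to the top."""
    path = [term]
    while path[-1] in BROADER:
        path.append(BROADER[path[-1]])
    return path


def documents_under(heading):
    """Documents indexed by the heading or by any term beneath it."""
    return sorted(
        doc
        for doc, terms in INDEX.items()
        if any(heading in chain(term) for term in terms)
    )


print(chain("Strength"))
print(documents_under("Materials"))
print(documents_under("Properties"))
```

The same document is reached from the materials hierarchy and from the properties hierarchy, although neither hierarchy contains a combined heading such as "strength of concrete".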

A universal, systematically (and alphabetically) arranged
thesaurus can be created by successively sifting
the vocabularies from different subject fields through the
“screen” of a universal system. The dilemma is that a
universal term system is lacking. There are two ways
of creating such a system: by way of theoretical speculation
or, practically, by proceeding from a universal document
system. The latter must, however, be successively adjusted
to a systematized arrangement adapted to terms,
as is shown by the comparison between Index Medicus
and UDC. The more conceptual and the less bound to
tradition the system used, the easier it seems to be to
transform it into a term system.

[Fig. 2. Panel 2: Hierarchy in a term system. Labels recoverable from the figure: Materials; Inorganic materials; Normal concrete; Lightweight concrete; Organic; Processing; Physical; Properties; Properties of materials; Mechanical properties; Hardness.]

Attempts on a smaller scale with assortments of vocabularies
from different quarters have been started with
the help of TUS as a provisional term system. Every term
is given a number according to TUS and is then sorted
both on the basis of this number and alphabetically. With
the aid of punched cards a project of this kind can very
well be carried out on a large scale, and the value would
be greatest if the work could be conducted on a universal
basis.
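
The double sorting described above is a simple operation once each term carries its number. The Python sketch below illustrates it; the TUS numbers shown are invented placeholders, not real TUS notations, and the assumption that the parts of a number are compared as integers is ours.

```python
# (term, number) pairs. The numbers are invented placeholders and do
# not reproduce actual TUS notations.
NUMBERED_TERMS = [
    ("granite", "55.2"),
    ("hot", "53.6"),
    ("force", "53.1"),
    ("long", "51.3"),
]


def notation_key(number):
    """Sort key for a notation; parts are compared as integers (assumed)."""
    return tuple(int(part) for part in number.split("."))


# Systematic list: ordered by number, as in a classified arrangement.
systematic = sorted(NUMBERED_TERMS, key=lambda pair: notation_key(pair[1]))

# Alphabetical list: the same cards ordered by term.
alphabetical = sorted(NUMBERED_TERMS, key=lambda pair: pair[0].lower())

for term, number in systematic:
    print(number, term)
for term, number in alphabetical:
    print(term, number)
```

The two lists contain the same pairs, so the alphabetical list serves as an index to the systematic one.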

To avoid misunderstandings it should be emphasized 
that a term system constructed on the principles pre- 
sented here will be quite different from a concept system 
for classifying documents. It has been shown that a docu- 
ment system of the normal model is not suited as a term 
system. But can a term system be used as a document 
system or as the framework, in one sense or another,
of a document system?

Obviously many traditional subject fields will be 
broken down (cf. Index Medicus). But if we think not 
of the universal libraries but primarily of the special 
libraries where often this breakdown has already been 
made, it may be wise to postpone expressing an opinion 
until the question has been examined. 

Opinions among documentation experts concerning the 
possibility of creating a universal system of terms can be 
illustrated by some quotations. Vickery (19) puts the 
question: 


Is a universal descriptor language possible which em- 
braces all subject fields, and is usable in all retrieval 
systems? ... Perhaps the present area of specialization
in retrieval will be succeeded by a synthesis, leading
to the general use of a common “interlingua.”


This question comes back unaltered in the second
edition of his book (1964). Evidently nothing has happened
in three years that has made it possible for Vickery
to give an answer to his question.

Mortimer Taube had some doubts on the possibility 
and wrote: “. . . no one has actually produced a general 
categorization or classification of a total vocabulary.” 

The universal facets of CRG must in this connection
be of interest, as Foskett presents the theory of integrative
levels as a principle for dividing terms into groups.
The relation between document systems and term systems
here appears in a new light.





5. Practical Systems


By practical systems is here meant systems used in
industrial enterprises, in trade, and in other practical
activities, aimed at the classification of products, costs,
etc., and for filing purposes.


A. PRODUCT SYSTEMS


Systems for technically produced things we may call
product systems. All industrial enterprises need such
systems to maintain order in their manufactures and
often spend large amounts of money on them. They are
used for stockkeeping, for registration of drawings, for
filing of documents on products, for costing and industrial
planning. Often an entire trade has a common
system (e.g., the hardware trade, the furniture trade).

Products can be classified from different points of
view. Unlike natural things (plants, etc.), they cannot be
classified according to genus-species relations, but the
raw material and the industrial production process offer
a basis to some extent. Material, shape, and purpose
are for many products the most important factors,
and a concrete drainpipe may thus be classified as a pipe,
as a concrete product, or as a product for sewage.

These remarks apply primarily to bulk materials and
to semimanufactured products. The classification of
machinery and apparatus offers greater difficulties since,
in this case, the principle of function (a particular physical
principle, for example), or a category of user, may
offer the best classification (e.g., electrical or optical
equipment, office supplies, etc.). A universal system
would seem to be necessary as a basis or background
for compiling a comprehensive system for the latter
categories.

The degree of complexity can also be used as a first 
principle of classification, and under the main groups 
derived in this way various principles of subdivision can 
be employed. 

Product systems in industry are often used also for
classifying documents concerning products; this is sometimes
their main aim. Sometimes these systems acquire
the character of complete document systems, with space
also for sciences, social questions, and the like. In such
cases they often compete with the system used in the
library, which in turn may be elaborated in detail into
a product system. UDC tends in this direction but is
generally not used as a product system; as a rule industrial
enterprises draw up their own systems, which they
consider better suited to their specialties.

The existence within the same firm of two or more 
schemes with different structure, but comprising partly 
the same concepts and terms, cannot be an ideal solution. 
This now seems to be the rule, however, and a big firm 
or institution sometimes has even three, four, or more 
schemes which perhaps could be replaced by a single one. 

The need for a document system which is also a complete
industrial product system, or at all events constitutes
a basis for such a system, is fully legitimate.
One may then ask whether this is an unreasonable requirement,
or whether one can imagine a universal classification
system which can serve satisfactorily both as a document
system and as a basis for a universal thing-and-product
system.

It is naturally not possible to construct a universal 
product system simply by adding together a number of 
such industrial systems, as they usually overlap to a great 
extent. Perhaps a universal product system could be 
made as a faceted system with facets for material, shape,
function, etc.

A standardized facet for materials and another for
form may be the best way to start, if we strive for
some uniformity in this field, and these facets could in
some way be included in a universal document system.
The function facet could also be a connecting link with
a universal document system.
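
A faceted product system of the kind suggested here can be sketched as a record with one field per facet. The Python example below is a minimal illustration, not a proposed standard: the concrete drainpipe is the example used earlier in this section, the other two products are invented, and the names Product, CATALOG, and select are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    material: str  # material facet
    shape: str     # form facet
    function: str  # function (purpose) facet


# The drainpipe is the example from the text; the other two products
# are invented for illustration.
CATALOG = [
    Product("concrete drainpipe", "concrete", "pipe", "sewage"),
    Product("steel water pipe", "steel", "pipe", "water supply"),
    Product("concrete paving slab", "concrete", "slab", "paving"),
]


def select(**facets):
    """Names of products matching every facet value given."""
    return [
        product.name
        for product in CATALOG
        if all(getattr(product, facet) == value for facet, value in facets.items())
    ]


print(select(shape="pipe"))
print(select(material="concrete"))
print(select(material="concrete", function="sewage"))
```

The drainpipe is recorded once but is found as a pipe, as a concrete product, and as a product for sewage, so none of the three points of view has to be chosen as the only location.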


B. SYSTEMS FOR PROCESSES AND LABOR


Materials and products are the “substantive facet” in
industry, processes and labor the “verb facet,” and properties
the “adjective facet.” Processes include both intellectual
work and manual and mechanical work. In this
context we shall not be concerned with intellectual work.
To get a classification for processes and labor is more
difficult than for materials and products.

Some processes can easily be classified according to 
processed material (in the building trade woodwork, 
concrete work, etc.), but not all material processes or
material treatments can be classified thus, as they are
applicable to all or some kinds of materials (e.g., welding,
drilling). Other processes must be classified on the basis
of the finished product (e.g., installation or assembly
work) or as “auxiliary processes” which cannot be related
to either a particular material or a product. However, 
it is not impossible to get a generally acceptable division 
of the processes in industry; such a division is essential 
and has already been accomplished thousands of times. 
We have in fact only to consider which solution is the 
best one, if we want to get a standardized system of this 
kind. 

It would be possible to obtain some correspondence 
between systems for processes and systems for materials
and products, and it also seems possible to include a
category of material treatment processes in a universal
document system.


C. SYSTEMS FOR PROPERTIES 


It seems to be difficult to construct a universal system 
for properties of all kinds. For the properties of mate- 
rials, on the other hand, a system associated with the 
systematics of science may be fairly easily constructed,
as certain special institutes have devoted much effort to
studies of all the properties of materials. The study of
materials indeed consists to a large extent of determining
their properties expressed in units of measure. These
units, which are composed of a limited set of basic units
(cf. Section 1), are a good guide to the systematics of
properties, which is closely related to the classical subdivisions
of the natural sciences:


Mechanical properties
Thermal properties
Electrical and magnetic properties
Chemical properties


A system for properties of materials is valuable for all 
industrial enterprises producing materials in their specification
of properties. It is also of interest for testing
materials and for general consumer information. This is
also a necessary step when drawing up a list of terms for
technology.


Summary 


The purpose of this paper is to point out the fact that
we in documentation have use not for one kind of
classification system only but for different types, distinguished
by different types of subjects.

The discussions are often confused, as the primary
purpose of the classification is not made clear. Some experts
talk of classifying knowledge, others of classifying
concepts, ideas, terms, etc. We will get a more fruitful
and pragmatic approach if we consider the type of subject
of the general library classification systems to be
documents, even if the system is constructed on the basis
of fields of knowledge or on concepts.

In documentation we also have a need of a classification
system for terms, a term system, based on the semantic
signification of the terms. The demands on the
hierarchical structure are always determined by the subject,
thus for term systems by the terms themselves. The
pattern in the “realm of knowledge” is not the same as
in the “realm of terms.” To investigate the principal differences
between these patterns seems to be an important
task.

Furthermore, in all industrial enterprises, etc., we have
need for practical systems with different objects, e.g.,
systems for products. These can be used for specifications
of products, for cost calculations, etc., and also for
filing of documents on the products.

It is important in developing classification systems of
different kinds to keep the subjects strictly in mind. It
may be that we want to use a document system for
classifying terms, or a term system or a product system
for classifying documents, or a document system for classifying
products, but we have to be aware of the fact that
we are then using these systems for a secondary purpose.

Letters to the Editor


June 28, 1966 
Dear Sir: 


Ashley Speakman, reporting (AD Newsletter, May-June,
1966, pp. 2-3) a panel discussion on “Intellectual versus
Mechanical Indexing” (at the May 12-13 “Colloquium” on
Information Retrieval), writes the following:


. . . intellectual indexing is the order of the day, primarily
for economic reasons; and mechanical indexing will replace
it, particularly in large information systems. There
was agreement on this, but not on when.


Mr. Holm and Mr. Liston (panelists) can speak for themselves,
if they care to, but it was not my impression that
they agreed with this thesis. In any case a number of statements
and questions from the audience (more than an hour
was devoted to audience-panel interchanges) indicated that
the meeting as a whole did not.

Was the Newsletter report written by Mr. Speakman or 
by an experimental computer program? 


JOHN O'CONNOR
R.D. 1
New Tripoli, Pennsylvania 


July 7, 1966
Sir:

During the past seven years as Director of the Office of
Documentation at the National Academy of Sciences-
National Research Council, it was a part of my responsibility
to be familiar with science information projects in
many subject fields. Among these were (in no particular
order):


Behavioral sciences
Brain science
Textile and apparel research
Highway research
Food and Nutrition
Automatic language processing
Chemical information and coding
Critical data
Prevention of deterioration
Geological sciences
Building research
Biomedical communication
Cardiovascular literature
Nuclear physics
Foreign science
Metallurgy
Pacific science


The point I wish to make is the similarity from a pro- 
fessional documentation point of view of the problems 
among this wide spectrum of subjects. Over and over again 
the fundamental problems of procurement, languages, classi- 
fication, indexing, etc., came up. It is more than ever my 
firm conviction that a real subject resides in science infor- 
mation work, and I feel I would be remiss if I did not 
express this as forcefully as I can on the basis of my ex- 
perience. I wish to do this now especially, as others have 
questioned the validity of a documentalist’s approach. 

Even a casual inspection of the above list will show some 
apparent overlap, but this makes my case even stronger, 
as the initial viewpoints were often quite different in apparently
related areas.


KARL F. HEUMANN
6410 Earlham Drive 
Bethesda, Maryland 


July 7, 1966
Sir:

In addition to my congratulations on Documentation
Abstracts, which I have expressed elsewhere, may I make two
points as a result of careful reading of the first issue:


1. The section Classification contains a total of 22 entries,
   divided into 5 original articles from the U.S. and
   17 from other countries. In my opinion this is a reasonably
   accurate reflection of the way the interest in
   this subject is divided.

2. The index parameters chosen (“Author” and “Anonymous
   Titles”) are good first choices, but I want to
   suggest one other easy one. I refer to the possibility
   of using the many acronyms as entries. I have found
   23 such items from AESOP (262) to UPLIFT (200),
   comprising 30 total entries, and I submit such a compilation
   would be both simple and useful.


KARL F. HEUMANN
6410 Earlham Drive 
Bethesda, Maryland 


Dear Sir: 


Most discussions of the information retrieval problem
start out by stating that information is a national resource.
I wonder if we might not get a better perspective of our
problem if we were to regard information as a waste
product.

Considering information to be a resource carries the idea
that this pile of paper and words has great intrinsic value,
that the quantity of it is limited, that its use must, somehow,
be budgeted lest the source run dry. It seems to me
that this corresponds very poorly to the true situation.

In fact, information is a by-product of most of our scientific
and technical activities. There is so much of it that
our capacity for most of it is overtaxed. The scientist
writes papers describing what he has done. The engineer
writes reports. The inventor prepares patent disclosures.
All of these are by-products. Some have no value at all
beyond the initial distribution, recording of receipt, and a
single reading by the receiver. The document describing
even the most important discovery has only a limited useful
life. If it has great significance, it will soon be boiled down
as part of a book covering the area to which it relates. If its
significance is minor, it will be noted by the author's peer
group and then neglected. In a good many cases, the information
will have been passed on to the peer group in oral
or preprint form well in advance of its publication. The
published material will have its primary value as a source
document for a later review or a still later book.

A document, once printed, tends to remain in existence.
Perhaps even more important, we tend to regard every
document as something that must be abstracted, indexed,
SDI'd, KWIC'd, microfilmed, mag-taped, and on and on.
Isn't there a good possibility that most of those documents
have lost almost all of their value by the time they get to the
abstracting service? Isn't there a good possibility that the
pile of paper is really a slag heap, not a mountain of pay
dirt?

Harry Baum, Director
Technical Meetings Information Service
New Hartford, N. Y.


May 11, 1966
Dear Sir:

For some time it has concerned me that no one has
written the history of ADI to date or has taken effective
steps to gather the materials necessary to such a history.
Recently some of my students struggled to write papers on
the subject, and I remember that Dr. Kaiser, former Executive
Secretary of ADI, had great difficulty assembling a
complete set of AD, compiling a list of past presidents, and
so on. These activities have pointed up the need for better
housekeeping, but I think they point to something more.
We have Dr. Watson Davis, founder of ADI, Dr. Luther
Evans and Mr. Scott Adams, both past presidents, and many
other interested and knowledgeable people within the New
York-Washington area as first-hand sources of unrecorded
information. We should be motivated to confer with them,
use what they can give us to fill in the record, and write a
more interesting and authoritative history than one which,
in the future, will have to be based on secondary sources
alone.





I am about to present a proposal which could have gone
directly to Council, but I thought there might be value in
making it public via this letter so interested persons could
send their reactions and, perhaps, offer suggestions, information,
or materials that bear on the proposal.


It is proposed that the Council of ADI appropriate $750
for compilation of an official history of ADI, to be published
in American Documentation as soon as it is completed;
$500 is to be an honorarium for the compiler, the
remainder, a travel fund for the compiler to interview
the persons judged best potential sources of useful information.
Cost of gathering or reproducing pertinent source
materials is to be charged separately to ADI; these materials
are to be deposited at the close of the project for
incorporation in the official ADI records. This proposal
is to be altered or enlarged upon at the discretion of
Council when it comes before them for action.


One of the graduate students at Drexel Institute has
written a partial history during this school year that is most
readable, but its content is limited to information from secondary
sources. She, Mrs. Lillian Shreve, has expressed
interest in being considered a candidate for the job of
compiler should the Council act favorably on this proposal
in the near future.


CLAIRE K. SCHULTZ

Senior Research Assoc. 
IAMC 

Line Lexington, Pennsylvania 


American Documentation — October 1966 217


Book Reviews 


10/66-1R Faceted Classification Schemes. 1966. Brian
C. Vickery. Rutgers Series on Systems for the Intellectual
Organization of Information, vol. 5. Graduate School of
Library Service, Rutgers University, New Brunswick, N. J.
108 pp.


It is not, I suppose, the place one would expect to find it, 
but volume 5 of the Rutgers seminars has for the reader of 
Othello and The End of the Affair a similar treat in store: 
a constantly problematic time-scheme. Those who have 
read volume 2 in the same series (J.-C. Gardin's SYNTOL) 
have had the same sort of tangled skein offered to them 
already. It does not seem to me too much to expect
that the presentation itself be simply reported, followed by 
the panel-discussion comments—except that the size of 
such a transcript might demand some abbreviation. 

This has not been done here, however. Each of the 
Rutgers seminars is intended to present one system (or, 
as here, one family of systems), but to do so not in a textbookish
style; the lecture style plus discussion is to be the
instrument of this opening up of the texture of the argument.
But if the whole thing, instead of being edited
(with, perhaps, substantial abridgement), is given back to 
the author of the central lecture, who may then turn it 
into a collage displaying (as here, and as with Gardin's) 
a totally different seriality than the event itself constituted, 
we may well expect some degree of confusion. The con- 
fusion arises when the central lecturer, forgetting that his 
readers cannot, like him, relate the new string of (printed) 
words to the original string of (spoken) words, proceeds to 
explain things to himself while mystifying his readers. 

For this reason, perhaps, but possibly for others as well, 
the Gardin volume (though most of its reviewers have been
hesitant to tick it off properly) was a near-total failure to
convey what it intended to. 

Or at least it seems to me that there is some intention
to convey something: Mills’ volume on the UDC and
Ranganathan’s on the CC convey something. But neither
Gardin nor Vickery seems to be quite sure (a) of what it is
that they are supposed to convey, or (b) of what sort of 
audience they are aiming at. In the volume at hand, for 
instance, there are extended passages in the early pages 
(but, presumably, chronologically posterior to the rest of 
the volume), drawn from remarks of rapporteurs such as 
Rees and Mooers, which are about classification in the 
most general sort of way possible. Now I will admit that it
is wise to lay such a groundwork of principles whenever
one intends to launch out into unconventional theorizing
or into criticism of some particular manifestation of the
general principles; but when the sequel is nothing but a
comment upon a description of a single family of classifications,
and when the necessary background is simply the
central lecture itself, the conclusion cannot but be that the
work as a whole is intended to convey introductory-level
information. But if this is the case, the central lecture 
becomes all the more crucial in its original unity: discussion, 
especially if it tends to be corrosive, is appropriate to the 
report of a "seminar" intended for a reading audience thor- 
oughly aware of the fundamental principles and their mani- 
festations; statement (though not, hopefully, dogmatic; nor 
even, necessarily, apodictic) is the appropriate vehicle for 
introduction. 

It would be well for an introduction to the faceted family 
of classifications to be available to the large numbers of 
American classifiers unfamiliar with them, and therefore 
unaware of the advantages of the analytico-synthetic approach
as the basis of mature criticism of the current strategizations
most in use by them: Library of Congress Classification
(and Subject-Headings), and Dewey Classification.
Such critical evaluation can best (or at least most easily)
be made in terms of comparisons of the results of
the underlying principles, rather than of the principles
themselves, as stated in such theoretical form as (say) in
Ranganathan's Prolegomena to Library Classification (London,
Library Association [2], 1957). Once the results are
seen, these more difficult backgrounds can be explored.
However, and here we come to the central failure of the
volume in hand, it is not with such an introduction that we
seem to be dealing.

Sections I and II (“Introduction to Faceted Classification”
and “Aspects of Information Retrieval”) are rudimentary,
to say the least. And yet a good many passages
are technical, not to mention highly speculative. Such 
speculation is of course to be encouraged if it leads some- 
where—even in an introductory work. But the statement 
(p. 46) that "by combining terms in compound subjects 
[faceted classification as a technique of conceptual bibli- 
ography] introduces new logical relations between them, 
thus better reflecting the complexities of knowledge” in no 
way justifies itself; nor does it lead anywhere valid, since 
while it is indeed true that concepts not previously juxta- 
posed may well demand a new logical relation (or we could 
say “imply a new logical relation”), this relation is by no
means given to us in any sort of helpful way by the fact 
of the juxtaposition. Again (p. 57) we find that “the only 
argument in favor of the use of symbols in a classified
catalog is the actual classified arrangement of materials,” 
for the goal of generic/specific strategization of search. 
This is true, but it is not the only argument, since without 
the symbols attached to the terms there can obviously be 
no such thing as a class-schedule at all—unless we are
willing to go all the way back to an alphabetico-classed
arrangement. Hospitality, next (p. 57), is erroneously defined
as the ability of notations to be “free to extend”—
which misses the crucial point about hospitality (either in
array or in chain) that it is what allows for intercalation
of new notations for new concepts. On the next page “inter- 
calation” is used in the sense of the drawing in of notations 
from a more general classification, such as the fairly common 
use of the UDC (1/9)-table for place as penumbral to many
special schemes. Finally, many librarians (and bibliogra- 
phers, too, obviously) will take exception to the too-common 
documentalist-opinion (p. 70) that bibliographical descrip- 
tion is “basically clerical.” 

In a broader criticism of the book (as well as of several
other members of the Series), it must be asked why there 
is no index; what do discussions of searching methods tell 
us about the (family of) system(s), unless they are methods 
peculiar to the system itself; and, most fundamental of all, 
how can one system be adequately described without com- 
parison to others? Thus, not all the blame for the failure 
of this volume can be placed with its author: his handicaps 
are many, and the temptation to try to weld the pieces 
together after the fact is not really an advantage. Yet 
Vickery (and Aitchison, in her “Case History: The English 
Electric Scheme” [section VIII]) has managed to outline
the process by which a faceted classification is prepared and 
used. This is done sometimes not in the order most helpful 
to the uninitiate, though each difficulty is eventually explained;
it is done without overconfidence in the salvific
power of such techniques, but at least sanguinely; but it is
done, I fear, without adequate theoretical foundation. What
we have here is a torso, perhaps helpful, but characteristic
of the series to which it belongs; one can only hope that
it whets the appetite of those who need it, even if it does
not totally satisfy it.


JEAN M. PERREAULT 

Lecturer

School of Library and Information 
Services 

University of Maryland 

College Park 





10/66-2R Computer Typesetting: Experiments and
Prospects. 1965. Michael P. Barnett. MIT Press, Cambridge,
Mass. 246 pp.


While Director of the Cooperative Computing Laboratory
(CCL) and Associate Professor of Physics at M.I.T.,
Dr. Michael Barnett carried on developmental work in the
field of computer typesetting. After first seeing a Photon
photo-typesetter in operation in a commercial plant early
in 1961 he began a project that summer to activate that
machine by means of computer-produced paper tape. A
grant enabled him to enlarge his staff and increase his
efforts. The work he describes in his book continued until
early in 1964. At that time he returned to England and
completed this study while at the University of London.
Now back in the United States, Dr. Barnett is Staff Engineer
in the RCA Systems Division at Princeton, N. J.,
having apparently been bitten in earnest by the typesetting
bug.

Barnett’s book begins with a description of the computer 
programs written for typesetting purposes at CCL and an 
account of the manner in which the Photon device operates. 
At the outset of the project he knew nothing about the 
intricacies of composition, either in “hot metal” or on film, 
but he learned quickly and, as a pioneer in this field of 
computerization, achieved some significant results. 

He was not the first to conduct experiments of this nature, 
utilizing computers, but at that time he was ignorant of the 
work that others before him had performed. He was not 
aware of the comprehensive study, known as the Rome Air 
Force Project, subsequently published (in 1962) at the 
Thomas J. Watson Research Center* or of the experiments 
with Linofilm composition for final output in Russian- 
English translations. Nor was he familiar with the efforts 
expended in the newspaper field where programs of “justi- 
fication and hyphenation” for “hot metal” composition 
were currently under development with the support of such 
major hardware suppliers as IBM, GE, and RCA. And yet,
without the practical assistance he could have derived from 
broader exposure to the requirements of the printing and 
publishing industry, he and his staff created some very
respectable programs which went a long way to arouse
interest, especially in the scientific community.

Barnett was inevitably plagued with hardware incompatibilities
and his method of producing input to the computer
was cumbersome: eight channel Flexowriter tape to
punched cards by way of a modified IBM 047 paper-tape-to-punched-card
converter and ultimately into an IBM
32 K 709 computer. The magnetic tape output from the
709 was then read at the M.I.T. Lincoln Laboratory by
the IBM 1401 and 1012 attachment to punch eight channel 
paper tape, two frames of which are required to describe 
one character for Photon 560 input. In the early chapters 
of the book he describes the equipment employed and cer- 
tain of the features of the software packages he and his 
staff created. Too much detail is included to hold the 
interest of the casual reader; not enough is provided to 


* Graphic Composing Techniques, RADC No. IDR 61-810, Final 
Report, Contract AF 30 (002)-2537. 





satisfy the curiosity of one who wishes to become intimately 
familiar with the problems and solutions encountered. One 
of the major but unavoidable shortcomings of current pub- 
lications in the field is the difficulty of defining the charac- 
teristics of the readership circle, which may or may not 
include computer people, printers and typesetters, pub- 
lishers, librarians, information-handling specialists, econo- 
mists, and automation sociologists. 

The interest of this reviewer was to discover whether
Barnett had succeeded in identifying the same problems we
at Rocappi had encountered, both with respect to software
design and more especially in the vital area of man-machine
relationships. It is not so much in discovering ways to
make typesetting machines perform that the challenge
exists. It is rather the development of an operating system
which takes into account all of the diverse elements that
must be brought together for the successful production of
a book. Solutions which evolved under Barnett's direction
are much like those which we employ and, like him, we
came to our techniques more or less in ignorance of what
others may have developed. One looks for parameters which
provide useful “hard copy” counterparts and have some
meaning in the trade. One develops format shortcuts and
one begins, little by little, to explore the data-processing
applications wherein typesetting emerges as a by-product
or an incidental but useful end result of other computer
processing activity, or conversely, the data-processing by-products
which may be derived from typesetting.

Barnett's group developed two basic types of programs:
one accepted non-fielded input of the “sentential” variety
and justified it without hyphenation by expanding the
interword spacing. This is called “TYPRINT.” The other
accepted information already on punched cards or magnetic
tape in fields of fixed or variable format, generating,
where appropriate, symbols to enable the typesetting machine
to set the selected material in a variety of type faces
with appropriate uppercase characters. This is “TABPRINT.”
Still another option, “BCDPRINT,” was developed
to facilitate the input of Hollerith cards, and some
relatively simple but attractive mathematical equations 
were developed from linearized representations, in conjunc- 
tion with J. M. Gerard. One of the most valuable aspects 
of the book is its illustrative material, since it shows not 
only numerous samples of output but the coded input which 
was required in order to elicit that output. 
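
For readers unfamiliar with justification without hyphenation, the general idea attributed to TYPRINT above can be sketched as follows. This is only an illustrative modern sketch in Python, not Barnett's program: the function names `justify` and `_pad` are hypothetical, and it assumes fixed-width characters, whereas a photo-typesetter works with proportional character widths.

```python
def _pad(words: list[str], width: int) -> str:
    """Spread the spare space of one line over its interword gaps."""
    if len(words) == 1:
        return words[0]                      # a lone word cannot be stretched
    gaps = len(words) - 1
    spare = width - sum(len(w) for w in words)
    base, extra = divmod(spare, gaps)        # leftmost gaps get one extra space
    parts = [w + " " * (base + (1 if i < extra else 0))
             for i, w in enumerate(words[:-1])]
    return "".join(parts) + words[-1]


def justify(text: str, width: int) -> list[str]:
    """Fill lines greedily with whole words (no hyphenation), then justify."""
    lines, current, letters = [], [], 0
    for word in text.split():
        # len(current) is the minimum number of spaces needed if word is added
        if current and letters + len(word) + len(current) > width:
            lines.append(_pad(current, width))
            current, letters = [], 0
        current.append(word)
        letters += len(word)
    if current:
        lines.append(" ".join(current))      # last line is left ragged
    return lines


for line in justify("the quick brown fox jumps over the lazy dog", 16):
    print(repr(line))
```

Every line except the last is brought to the full measure purely by widening the gaps between words; a word longer than the measure is simply left on a line of its own.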

It is not this reviewer's purpose to appraise Barnett's
programs. It is one thing to produce output in a laboratory.
It is another to arrive at timely, economic, and reliable
production solutions. Barnett’s work was concluded before
he got to what seems to many to be the heart of the problem—namely,
the development of reliable and efficient updating
and correction procedures. Without these, phototypesetting
by computer is clearly impractical. Work done
more recently at the University of Pittsburgh (the inheritor 
of the Barnett Photon and programs) has placed a great 
deal more emphasis upon pre-typesetting revisions, although 
their solutions seem presently to be out of the question in 
terms of commercial economics. 

Barnett uses the term “Computer-Aided Typesetting 
Process” or CATP to describe projects of all varieties in
this field, and in the latter part of his book he summarizes 
some of the work of others. This summary, as the author 
admits, is far from complete. It suffices to give a flavor of 
the flurry of activity now being carried forward in many 
quarters but it is by no means a definitive compilation of 
such efforts. It virtually ignores the significant activity 
carried on in the commercial sphere by newspapers, com- 
puter hardware manufacturers, printers, typesetters, and 
centers such as our own. C. J. Duncan, of Newcastle Uni- 
versity, seems to have assumed the role of official reporter 
of worldwide developments, and one can also derive a 
great deal of information from the conferences of the 
Research and Engineering Council of the Graphic Arts and 
those of American University in Washington, D. C. It 
appears that Professor Barnett added his brief summary 
merely to give perspective to his own experimental work 
and perhaps to lay the background for his concluding chapters
which are concerned with “prospects,” but a more
comprehensive survey and appraisal would have substantially
enhanced this undertaking.

Chapter 9, “The Economics of Computer-Aided Type- 
setting,” correctly enumerates most of the factors which 
are relevant to a consideration of costs. Unfortunately he 
does not weigh these factors or attempt to assign numbers 
to any of them. Studies have been made on the speed of 
keyboarding raw paper tape, for example, which permit 
actual keystroke costs to be projected. Realistically ap- 
praising the tremendous problem of developing appropriate 
typesetting software, Barnett expresses a strong preference 
for compiler-type programs and recommends against “tight” 
solutions that go to the limit of a computer’s capabilities. 
In this respect it seems to this reviewer that he ignores the
economic realities that compel commercially-oriented ventures
to minimize the cost per thousand input keystrokes
by striving for high operating speeds and low monthly com- 
puter rentals (with correspondingly small memory and 
storage areas). 

The book is largely impersonal, but occasionally Barnett 
permits his social judgments to come to the fore. "The 
author has found it interesting,” he states, “to try convey- 
ing the experience of participation in the development of 
something new to groups of senior keyboard personnel in 
England. He was left with the impression that these
people’s attitudes were conditioned by a feeling of social
incompatibility with participation in such activities. Despite
the fact that, in principle, educational opportunities
are available in England to children of every economic 
background, there are very strong social factors that limit 
the acceptance of these in a craft or artisan environment, 
and when they are accepted in such an environment, a
fairly strong attempt is usually made to break away from 
an employment that is associated with it. Interrelated 
factors seem to be involved in the attitude to transition 
from the shop floor to management, which happens only
rarely in England and when it does, is regarded as
some form of betrayal of principles rather than as a progression.
England may be particularly unfortunate in this
respect. The accent of speech in England carries many of 
the barriers to change, passed on from generation to genera- 
tion, which sociologists ascribe to the color of skin in the 
United States.” 

Elsewhere, in discussing programming tactics, he touches
on a peculiarly vulnerable relation between programmer
and manager: “A novice who has just learned to program
underestimates the complexity of a job in which he wants
to use a computer, underestimates the work of writing a
computer program, is ignorant of the existence of extensive
programming practicalities that are not taught in programming
courses and are really learned only from protracted
personal experience, is highly enthusiastic and optimistic
about a program he decides to write, and strongly opposes
any attempts to guide or supervise his work. Quite often
a situation arises in which a novice in this frame of mind 
is indulged by a manager who has a very casual knowledge 
and who is overawed by the novice’s enthusiasm and 
jargon.” He warns about the “highly personal program 
that may allow its author to exercise a temporary tyranny 
over his employer, but which may also turn into an acute 
embarrassment for the programmer.” He comments about 
the “pernicious” outside influences which beset work of 
technological potential and warns that the “university must 
not be associated ... with exaggerated claims or premature 
attempts at implementation.” In passing, he observes that 
“America’s great advantage is its administrative structure,” 
and “Britain places faith in an oligarchy which is not con- 
strained by checks and balances of the sort which exist in 
the United States.” 

The last overly-brief chapter of “conclusion” points the 
direction new research might take, indicates the probable 
trend of CATP and relates typesetting by computer to 
“information generating procedures,” “the transformation 
of information,” and information retrieval. “CATP’s may
well widen the flood of printed pages that are produced
in the cause of academic tenure, managerial competition, 
ego satisfaction, and the fear of making decisions that seek 
protection in anthropomorphisations of the digital computer.
Nonsense may gain the authority of the well-typeset
page in much greater quantities than now. . . .”

There is much sense and no nonsense in Barnett’s useful 
book. It is a treatise which will be of value to anyone con- 
cerned with the “knowledge explosion.” Its inadequacies 
stem largely from the fact that we are not dealing with a
science with its well-defined frame of reference, so that it is 
not clear to the reader what is strictly pertinent. Here 
technology is opening new vistas. Barnett helps us to see 
some of them at close hand, and with occasional glimpses
we are able to peer somewhat farther down the road toward
the future.


JOHN W. SEYBOLD
President 
ROCAPPI, Inc. 


10/66-3R Mental Health Book Review Index. Vol. 10,
Whole No. 15. 1965. Compiled by the Editorial Committee
and Contributing Librarians. Council on Research in Bibliography,
Inc., New York. 78 pp.


This is the tenth anniversary volume of the Mental
Health Book Review Index which now has listed a total
of over 18,000 references to reviews of 3,288 monographs
dealing with mental health, a field defined by the editors
of the Index as one which “makes use of a core of knowledge
contributed by the behavioral sciences and amplifies
it with scientific, professional, and social ramifications and
applications.” The annual volume this year includes around
300 titles in alphabetical order by author and gives for
each entry three or more signed reviews selected from
over 200 English language journals. Titles in one year's
list may be repeated later if three or more additional reviews
have subsequently appeared. Though the behavioral
sciences are broadly represented, the emphasis is on the
psychological sciences at the center, for at least one review
for every book selected must come from a journal in the
latter area. The intent is to cover important topics fully,
though not exhaustively. 

The Index is produced through the combined efforts of 
an editorial committee chaired by Librarians Ilse Bry and 
Lois Afflerbach, a group of 38 contributing librarians asso- 
ciated with work in mental health, and a committee of 
specialist consultants. As would be expected from a group
of this kind, bibliographic detail is full and clear. 

The uninitiated might perhaps plan to use this compila- 
tion as a book selection tool, but its listings are too late to 
be of value in this respect except in building a retrospective 
collection. Its primary objective, as discussed by Dr. Bry 
and her co-chairmen in a series of editorial prefaces written 
for the annual issues, has been of a quite different nature.
The aim has been, through a review of reviews, to develop 
a mechanism “for identifying and organizing an evolving
literature in a new domain of knowledge.” Monographs 
have been thought of as synthesising and interpreting the 
research literature of this new domain and much attention 
has been concentrated, especially in the later years, to its 
multidisciplinary character. 

The editorials contain some very interesting observations. 
Titles have been found to be under critical review for as 
long as five years. At the same time, less than a quarter of 
the books reviewed in the journals selected for indexing 
have received three or more reviews, although some have 
received 25 or more. Thus the titles which have met the 
criteria for listing in the Index have been given a collective 
evaluation by the scientific community. The evaluation 
serves not only as an assessment of research in the behavioral
sciences, but as a device for showing by their
absence from the review record important areas which need 
to be brought into focus. 













There are judicious words on the subject of bibliographic
style as a means of scholarly communication, on proper
bibliography as a means for allowing a maximum number
of correct and significant inferences from a minimum of
words, and on the futility of attempting to control the
literature crisis by the indiscriminate listing of all the
publications in a given field rather than by separation of
what is worth retrieving from what is not.

The thoughtful, scholarly approach to bibliographical
problems brought out in its editorials deserves a wider
audience than the special subject area and format of the
Index customarily draws. The latest discussion, for example,
is devoted to the need for an entirely new kind of
classification for organizing the behavioral science literature.
The evolution of the concept of the sciences of man
in the 16th century into the behavioral sciences of the
20th century is traced in parallel with the development of
classification schemes during the same periods. Two diver- 
gent trends in classification are noted: the anthropocentric, 
in which subjects are grouped in the sequence in which 
man might reasonably pursue them in an ordered search 
for knowledge, and the anthropotropic, in which man turns 
to himself to study the nature of man. The major library 
classification schemes are anthropocentric, but the behavioral 
sciences are essentially anthropotropic; hence the impasse 
in attempting to relate them satisfactorily with one another 
and with the collection as a whole. The editors list three 
major problems: (1) the need to transfer psychology from 
its 19th century position as part of philosophy as in the 
Dewey Decimal and Library of Congress classifications, 
(2) the integration of psychology and psychiatry now separated
in all general classifications except Bliss, and (3)
determination of whether psychology can be confined within
a class at all. In actual fact, the first two problems would
disappear if the third could be solved. The solution proposed
is an open orbital system for the behavioral science
literature rather than the traditional linear scheme. The 
psychological sciences (psychology, psychoanalysis and psy- 
chiatry) form the center of the orbit and thus integrate the 
anthropocentric and anthropotropic approaches. Other fields 
find their place in the orbit in accordance with their degree 
of interaction with the psychological sciences. New fields 
and new interdisciplinary areas can obviously be accom- 
modated at will in such a system. The scope of the system 
is suggested as “knowledge about behavior in its roots and 
manifestations, in man and animals, in individuals, groups, 
and culture, and in all conditions, normal, exceptional, and 
pathological.” 

The proposal is certainly an interesting one, but certain 
of its assumptions require elaboration. Though the need to 
delimit the behavioral science literature from the total 
literature of man is postulated, the definition of areas of 
interaction is so broad that precision in separating one from 
the other would be difficult indeed to obtain. The orbital 
idea might function very well in a system designed for users 
primarily from the psychological area, but the practicality
of combining orbital and linear schemes in one general 
bibliographic system which would, for example, in the area 
of neurochemistry give equal satisfaction to the psychologist, 
neurologist, and chemist is not so evident. The editors go 
on to state that the Index has for the past ten years been
experimenting with the orbital organization of the behav- 
ioral science literature. This statement also needs further 
explanation, though the implication is that the manner in 
which reviews are selected would automatically insure an 
orbital pattern if the reviews were rearranged by subject 
areas instead of by author as at present. 

This volume of the Mental Health Book Review Index,
like its predecessors, is its own best witness that it has been
compiled with careful thought for the time and people it
serves.


LOUISE DARLING
Biomedical Library
University of California, Los Angeles


10/66-4R Mining, Minerals, and Geosciences. Volume 
2 of Guides to Information Sources in Science and Tech- 
nology. 1965. Stuart R. Kaplan. Interscience Publishers, 
a division of John Wiley & Sons, New York City, 599 pp. 


The aim of this book — to provide a comprehensive guide 
to continuing sources of information in the fields of metallic 
and non-metallic mining, metals, fuels, minerals, geology, 
geophysics, beneficiation and processing, geography, and the 
broad area of pure and applied earth sciences—is good; 
the result: somewhat less than a bull’s eye, but useful 
nevertheless.

Part I, 445 pages, lists organizations which are sources of 
information about one or more of the broad subject areas 
covered. These sources are arranged by nine major geo- 
graphical areas, starting with international organizations 
and then proceeding to North America, Central America
and Caribbean Islands, South America, Europe, Africa,
Middle East, Asia, and Oceania. Listed within each of
these geographical areas are the organization name, address,
telephone number, telegraph and cable address, brief description
of purpose and functions, year organized, organizational
structure, divisions, departments and sections, and
their function, regional, branch, and district offices and
addresses, membership, and publications. This information
appears in alphabetical order by organizations, under each
country in the geographical areas, except that Canada,
United States, and United Kingdom are further subdivided
into more manageable units such as government agencies;
scientific organizations, institutes, and associations; and
states or provinces.

Part II of the book lists in 116 pages the published literature
in the fields of geography, geology, geophysics and
geochemistry, glaciology, lapidary, mineralogy, mining and
metallurgy, oceanography, paleontology, photogrammetry,
physics, science, seismology, soil science, speleology, ceramics,
coal, gas, iron and steel, and petroleum. Within
these major subject areas the literature is classified first
by types, i.e., abstracts, bibliographies, dictionaries, directories,
handbooks, or journals, and within each of those
types the publications are subdivided by geographical areas.

This directory of organizations can be useful for seekers
of contacts in various parts of the world. For example, this
reviewer is scheduled to pass through Calcutta, India, in
connection with a visit to a dam project, and was interested
to learn (pp. 393-394) that the Geological Survey of India,
located in that city, is likely to have information on engi- 
neering geology for the study of dam sites— perhaps this 
very one under study. 

Recognizing that chances for incompleteness of information
are good, the publisher has included a tear sheet on which
the reader can note recommendations, omissions, and corrections
and mail them to the publisher to the attention of the editor.
In perusing the volume, I found several places where the
information furnished was either incomplete or
entirely omitted. For example, under GEOPHYSICS AND
GEOCHEMISTRY, journals, USSR, the first item is:


GEOHIMIJA 
Geohimija, Vorobyevskoye shosse 47—a, Moscow, USSR. 


This is considerably less informative than can be found in 
entry 689 of a previous publication, A Guide to the World's
Abstracting and Indexing Services in Science and Technology,
Report No. 108, National Federation of Science
Abstracting and Indexing Services, Washington, D. C., 1963,
published under a grant from the National Science Founda- 
tion. This entry is: 
GEOKHIMIYA
Akademiya Nauk USSR; order from “Akademkniga,”
Pushkinskaya 23, Moscow K-104 USSR. Monthly; since
1956; 75 abstracts and 25 references a year to world
literature; 9 rubles. geochemistry

The absence of duplicate entries or at least cross references
also creates lack of clarity insofar as abstracts in
geophysics and geochemistry are concerned. For example,
on page 478, only one publication is listed under the
category of abstracts, i.e., Geophysical Abstracts (U. S.
Geological Survey), in the field GEOPHYSICS AND
GEOCHEMISTRY. However, on page 467 under GEOLOGY
in the category of abstracts, Geoscience Abstracts is listed.
The description of the latter publication states that it
includes material in geophysics and geochemistry. Thus, if
someone is interested in locating all the abstract
publications covering geophysics, he would need to look in several
places for the information.

Still on the subject of geophysics, this reviewer looked
for an entry describing the abstract journal in geophysics 
prepared by VINITI (the institute for scientific and tech- 
nical information) which he visited in Moscow as part of 
a UNESCO team in 1963. It was not listed in the book, 
even though it appears as item 1418 of the publication 
referred to previously. The reference is: 


Referativnyi Zhurnal: Geofizika, monthly; since 1957;
15,500 abstracts a year from world literature; annual 
author and subject indexes; 27 rubles, 60 kopecks (for 
organizations), 17 rubles, 25 kopecks (for individuals). 


This small sampling, which is limited to publications of
abstracts, may not be indicative, but it seems to the
reviewer that a list of sources of information that purports
to be comprehensive should include pertinent material
available in published form elsewhere.

Three indexes are provided: (1) an Index of Geographical 
Areas, (2) an Index of Literature, and (3) an Index of 
Organizations. These add substantially to the accessibility 
of the information to be found in this volume, but the
previously mentioned omission of “geophysics” as a subject
of Geoscience Abstracts is consistent with its omission in
the Index of Literature.

It is hoped that the first planned revision of this reference 
work to keep it up-to-date will also result in a more com- 
prehensive volume. 


Jack W. Hur 
Water Research Scientist 
U. S. Dept. of the Interior 


10/66-5R Library Science Today: Ranganathan Festschrift,
Vol. 1. 1965. Edited by Kaula Prithvi Nath.
Ranganathan Series in Library Science, 14. Asia Publishing 
House, New York, 832 pp. $30.00. 


The second volume of this Festschrift is a Ranganathan 
bibliography compiled by A. K. Das Gupta. The present volume
contains 117 signed articles by 109 authors (some of them 
are only joint authors), some of whom wrote two or more 
papers. The authors are from the following countries, 
among others: Australia, Austria, Germany, Great Britain, 
India, Japan, Nepal, Netherlands, Norway, Pakistan,
Switzerland, Thailand, and the U. S.

Among U. S. authors are James B. Childs, Robert B. 
Downs, Theodore A. Mueller, Jesse Shera (with James W. 
Perry), Ralph R. Shaw, Louis Shores, Maurice F. Tauber, 
and Lawrence S. Thompson. 

The main groupings are: classification (general), colon 
classification, faceted classification, cataloguing (general), 
cataloguing in Japan, subject cataloguing, documentation, 
laws of library science, librarianship, library movement, 
library organization, university libraries, library administra- 
tion, reference tools, social education, library education, 
evaluation (works), evaluation (works and life), evaluation 
(life), and reminiscences. Appendixes contain the list of
members of the Ranganathan Commemoration Volume
Committee, a list (with identification) of authors of papers,
a chronology of Dr. Ranganathan’s life, and the Committee's
appeal for contributions. A helpful index is included.

While many of the articles deal with other subjects, the
majority of them concern Dr. Ranganathan’s life and
place in librarianship. Much of the basic biographical
material is repeated many times by various authors. It is
generally asserted that his contributions have been very
great, frequently with a touch of what one suspects reflects
national pride. One wonders about statements such as the
following: “He is the international figure in the Library
World” (p. 532); “the Master-Architect in Library Science”
(p. 550); “the Father of Library Education” and
“a multi-faceted genius in Library Science” (p. 554).

On the other hand, Americans and others have joined in 
giving a high evaluation of Dr. Ranganathan’s contribu- 
tions. Shera and Perry claim that he “found librarianship 
little more than a bundle of techniques, a rather simple 
technology, and he, and his followers, have raised it to an 
intellectual discipline in its own right” (p. 45). 

Again, Coblans asserts: “it is Ranganathan who made the
first break-through in classification theory which provides 
possible structural techniques for handling scientific infor- 
mation” (p. 281). 

Shaw calls him “a truly great teacher” (p. 63), and
evaluates colon classification as “a milestone in the
development of the intellectual process involved in the organization
of the information that appears in recorded form” (p. 68).

Vickery thinks that Dr. Ranganathan’s “scientific approach 
to classification” is “his most enduring contribution to 
librarianship” (p. 109). 

Despite much repetition, this book is a contribution to 
library literature. It tells us about an international figure
in librarianship, and its Festschrift character has brought 
in several contributions on other subjects which will be
missed if one regards the book as devoted solely to Dr. 
Ranganathan. A few of these may be mentioned: cataloguing
in Japan, corporate author entry for the German
Federal Republic, government and official publications in 
the German Democratic Republic, copying the old catalogue 
of the Austrian National Library, books for Norwegian 
seamen, bookmobile service in Hawaii, Farmington Plan 
for Pakistan, the future of university libraries (Downs), 
university library building planning, the libraries of the 
UN, encyclopedia making, printing and printing collections 
in Kentucky, early history of European periodicals, and 
books for children and youth in the German Democratic
Republic. 

The book is printed in a satisfactory format, but on a
poor quality paper, and is not well-bound. A good many
typographical errors got past the proofreaders. A large
number of photographs add to the book’s usefulness. The 
price of $30.00 seems quite high. 


Dr. Luther H. Evans
Director, International and Legal Collections
Columbia University


10/66-6R Use of Mechanized Methods in Documenta- 
tion Work. 1966. Herbert Coblans, ASLIB, London. 89 pp. 


This is an excellent tutorial and critical review of the 
state of mechanization in libraries, documentation centers, 
and other information handling activities. From its intro- 
duction on, it emphasizes the difference between what it 
calls the “whole hog” approach to mechanization and the 
“housekeeping” approach, the one trying to put everything
into the computer, the other adopting a less mechanised
mix. The report consists of an introduction, then
three parts covering different areas of mechanization, and 
finally, three appendices. 

Part I emphasizes the more or less clerical areas of
mechanization: the housekeeping functions such as catalog
production, serial records maintenance, circulation, and
acquisition accounting and control. Part I concludes with
a discussion, pro and con, of the role of mechanization in
these areas. This discussion can be summarized as follows:
The principal advantage of such mechanization is the
likelihood of improved efficiency and effectiveness, but it
is hard to confirm just how much there will be because
staff is transferred to other service activities, present costs
are not well established and are not really comparable
anyway, and the rigid limitations imposed by mechanized
operations change some of the qualitative characteristics of
operation. In general, however, the evaluation is a positive one.
Part II discusses reference and document retrieval (with
its prototypical example being Medlars). It reviews several
of the approaches to it (hardware, software, and total
systems), including permuted indexes, image storage, TIP,
SDI, etc. It also concludes with an evaluative discussion,
based primarily on Medlars. Since both NLM and its
experimental sub-centers at Colorado and UCLA have
presented a continuing qualitative and quantitative
evaluation of this system, it constitutes an excellent basis for
operational evaluation. (The recent article* in the Bulletin
of the Medical Library Association is a model for anyone
reporting on such systems.)


Part III discusses the application of computers to data
retrieval, but so briefly that its value lies solely in having
recognized that data banks exist.

Appendix 1 is a relatively up-to-date bibliography;
Appendix 2 describes existing “non-conventional” systems in
the United Kingdom, and Appendix 3 is a brief listing of
some equipment available in the United Kingdom. There
is an index, primarily to names.


Dr. Robert M. Hayes
Professor
School of Library Service
University of California at Los Angeles


* Rogers, Frank B., “MEDLARS operating experience at the
University of Colorado,” Bulletin of the Medical Library Association,
54(1):1-10, Jan. 1966.


1966 WESCON Technical Papers
Los Angeles, August 23-26, 1966

WESTERN PERIODICALS CO.
Exclusive Distributor

Volume 10, Complete Set, 6 Parts, 91 Papers Plus Permuted Index: $56.00

Part 1 Antennas, Microwaves Communication
Sessions 7, 12, 19 (14 papers)
Part 2 Integrated Circuits
Sessions 5, 10, 20 (16 papers)
Part 3 Electron Devices and Packaging
Sessions 3, 8, 18 (14 papers)
Part 4 Computers and Data Processing
Sessions 1, 21 (10 papers)
Part 5 Systems, Space Electronics
Sessions 4, 9, 24 (14 papers)
Part 6 Instrumentation, Electronic Systems, Components
Sessions 11, 14, 16, 22, 23 (23 papers)
Permuted Index to WESCON Papers
Volumes 1-10, 1957 thru 1966

(Microfiche offer on one-to-one basis for
each complete set purchased $20.00)

Also Available:
IEC Packaging Symposium
Los Angeles, August 22-23, 1966
SAVE Symposium
Los Angeles, August 22-23, 1966

WESTERN PERIODICALS CO.
13000 Raymer Street 875-0555
North Hollywood, California


ADI Chapters and Chapter Secretaries


CENTRAL OHIO CHAPTER 
Mrs. Arleen N. Somerville 
Chemical Abstracts Service 
Ohio State University 
Columbus, Ohio 43210 
614-293-6586 


CHICAGO CHAPTER 
Mrs. Barbara N. Yanick 
Librarian, NALCO Chemical Co. 
6216 W. 66th Place 
Chicago, Ill. 60638 
312-PO7-7240, ext. 472 


DELAWARE VALLEY CHAPTER 
Miss Nellie A. Medzadour
Wyeth Laboratories 
Box 8299 
Philadelphia, Pa. 19101 
215-MU8-4400, ext. 678 


INDIANA CHAPTER 
Mr. Asa N. Stevens 
6157 E. St. Joseph Street 
Indianapolis, Indiana 46219 
317-357-6460 


LOS ANGELES CHAPTER 
Myra Grenier 
Aerojet General Corp. 
P.O. Box 296
Azusa, Calif. 
213-334-6211, ext. 5166 


METROPOLITAN NEW YORK CHAPTER
Miss Nanette Farley 
T. J. Watson Research Center 
IBM Corp. 
P.O. Box 218 
Yorktown Heights, New York 
914-WG5-2037 


NEW ENGLAND CHAPTER
Miss Virginia Valeri
Arthur D. Little, Inc. 
15 Acorn Park 
Cambridge, Mass. 
617-864-5770 


224 American Documentation — October 1966 


NORTHERN OHIO (CLEVELAND) CHAPTER 
Miss Helen Skowronska 
Sherwin-Williams Co. 
P.O. Box 6027 
Cleveland, Ohio 44101 
216-TO1-7000


PITTSBURGH CHAPTER 
Mr. James Brandi 
ALCOA Research Labs. 
Box 772 
New Kensington, Pa. 
412-337-6541 


POTOMAC VALLEY CHAPTER 
Mrs. Mary Herner 
Herner & Co. 
2431 K Street, N.W. 
Washington, D.C. 20037 
202-965-3100 


SAN FRANCISCO CHAPTER 
Miss Marilyn Johnson 
Shell Development Corp. 
1400 53rd Street 
Emeryville, Calif. 94608 
415-OL3-2100 


SOUTHERN OHIO CHAPTER 
Miss Marie L. Koeker 
2445 Fairport Avenue (home) 
Dayton, Ohio 45406 
513-235-3419 (office) 


SOUTH TEXAS CHAPTER 
Mr. Doug Yauger 
7107 Augustine 
Houston, Texas 
713-PR4-1269 


UPSTATE NEW YORK CHAPTER 
Mr. Herbert Ohlman 
Xerox Corp. 
Box 1540 
Rochester, N.Y. 14603
716-TR2-2000, ext. 22158 

Volume 16 and Volume 17 (Nos. 1, 2, 3 only)


This is a deep, informative, subjective index to ideas and concepts contained in papers, brief
communications, and letters to the editor, produced with the IBM Selectric typewriter.


Authors and subject headings are in one alphabetical listing. Page numbers refer to pages in
Vol. 16, unless underlined, in which case they refer to Vol. 17. Items from letters to editors
and authors and titles of reviewed books are in different type faces.


Abstracts; comparison with words
in titles 30
Abstracts, in indexing
Abstracts, telegraphic
Access time to information,
formula of 164
Acronyms in indexing 24
Adaptive information dissemination 
185 
ADI Meeting of 1964, comments on 
05 


19 


1 

ALBUM Н.П. 313 : 
"Alphabetical Subject Indication
of Information," by J. Metcalfe,
150

American Petroleum Institute
Information Retrieval Project. 
Subject Authority List, 127 
ANDERSON А.Ю. 185 

Answers and questions, matching 
of 26 








Antonyms in indexing 
ARTANDI S. 334
Arthur D. Little Report on
"Centralization and Documentation" 
70 

ASH L. 47

Aslib-Cranfield Studies 69 
Aslib-Cranfield Test of W.R.U.
Metallurgical Indexing System 73 
Associative document retrieval 5 
ASTM Bulletin 147? 


24 


As we may think 115

ATHERTON P. 186

ATHERTON P. 127

Author forums in 1964 ADI Meeting
107 

Author indexing in AD 56 

Author indexing, effects of 
discipline 309 

Author indexing, quality of 311 
Automatic indexing 5

"Automatic Indexing: A State-of-the-Art
Report," by M.E. Stevens
334 

Bar-Hillel, J, "Language and 
Information,” 127 

“The Baseball Program: An Automatic
Question-Answerer,” by A.K. Wolf,
C.S. Chomsky and B.F. Green Jr. 39
BASEBALL system of Green 175 

BAUM H. 28

BERNIER C.L. 323
bibliographic coupling and subject
indexing 223 

biomedical author-indexing form 
303 


BIRCH R.L. 46, 147 
BISHOP D. 115
BOHNERT І.Н. 36 
BOHNERT L. 334 


book catalogs 48 

book catalogs and computers 86 

Boolean algebra and information
systems 340

Boolean logic and term weights
45 

BOURNE C.P. 529 

BRANDHORST W.T. 124, 145

CARLSON W.S. 57

"Catalog Card Reproduction," Report
on a Study by G. Fry et al. 243

catalog production and computers
1 


"Centralization and Documentation" 
Arthur D. Little Report on 70 
central store concept of Nat. 
Info, Center 314 


Chemical Abstracts KWIC Index 122 


Chemical-Biological Coordination 
Center, Dougherty dissertation on 
79 

Chemical Titles and Chemical
Abstracts 540 

Chemical Titles in SDI system 197 
CHEYDLEUR B.F. 171
circulation data analysis 21, 
circulation requirements, user 
citation analysis, matrix for 


citation behavior, patterns of 179 
citation indexes and authors’
initials 148 

Citation indexing, improvement of 


selectivity in 81 

Citation relationship indicators 
for improving citation indexes 
84-87 

"Citations of Physics Literature" 
by Lipetz 242 

citations and the social system of 
science 181

Clapp V.W. "The Future of the
Research Library." 334 
classification and computers 248 
classification and indexing 67 
clearinghouse for scientific and 
technical meetings 28 
Clumping of index words 
clumping techniques and
348 

COBLANS H, 67 

Cohen and Craven study on Science 
Information Personnel 80 

COMMITTEE ON ORGANIZATION OF 
INFORM, 235 

Computerized serials systems 121 
computers and book catalogs 86 
computers and classification 248
computers and commercial publishing 
76 


81, 


5,6 
relevance 


24 


Computers and library catalogs 
conferences, congresses, and 
information 162 


consistency of indexers 33 


Page numbers refer to pages in Vol. 16, 
Items from letters to editors and authors and 